[Paper Review] Diffusion Schrödinger Bridge with Applications to Score-Based Generative Modeling
This paper reframes score-based generative modeling as a Schrödinger bridge problem and introduces Diffusion Schrödinger Bridge (DSB), an IPF-inspired diffusion method that bridges the data and prior distributions in finite time and comes with convergence guarantees.
Progressively applying Gaussian noise transforms complex data distributions to approximately Gaussian. Reversing this dynamic defines a generative model. When the forward noising process is given by a Stochastic Differential Equation (SDE), Song et al. (2021) demonstrate how the time inhomogeneous drift of the associated reverse-time SDE may be estimated using score-matching. A limitation of this approach is that the forward-time SDE must be run for a sufficiently long time for the final distribution to be approximately Gaussian. In contrast, solving the Schrödinger Bridge problem (SB), i.e. an entropy-regularized optimal transport problem on path spaces, yields diffusions which generate samples from the data distribution in finite time. We present Diffusion SB (DSB), an original approximation of the Iterative Proportional Fitting (IPF) procedure to solve the SB problem, and provide theoretical analysis along with generative modeling experiments. The first DSB iteration recovers the methodology proposed by Song et al. (2021), with the flexibility of using shorter time intervals, as subsequent DSB iterations reduce the discrepancy between the final-time marginal of the forward (resp. backward) SDE with respect to the prior (resp. data) distribution. Beyond generative modeling, DSB offers a widely applicable computational optimal transport tool as the continuous state-space analogue of the popular Sinkhorn algorithm (Cuturi, 2013).
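For orientation, the Schrödinger bridge problem and the IPF recursion mentioned in the abstract can be written as follows. This is a sketch in assumed notation, not a quotation from the paper: $p$ denotes the path measure of the reference forward noising process over times $0, \dots, N$, $\pi_0$ and $\pi_N$ are the initial and final marginals of a candidate path measure $\pi$, and $\pi^0 = p$.

```latex
% Schrödinger bridge: the path measure closest to the reference in KL
% among those with the prescribed endpoint marginals.
\pi^\star = \arg\min_{\pi} \left\{ \mathrm{KL}(\pi \,\|\, p) \;:\; \pi_0 = p_{\mathrm{data}},\ \pi_N = p_{\mathrm{prior}} \right\}

% IPF: alternate KL projections, each enforcing one endpoint constraint.
\pi^{2n+1} = \arg\min_{\pi} \left\{ \mathrm{KL}(\pi \,\|\, \pi^{2n}) \;:\; \pi_N = p_{\mathrm{prior}} \right\},
\qquad
\pi^{2n+2} = \arg\min_{\pi} \left\{ \mathrm{KL}(\pi \,\|\, \pi^{2n+1}) \;:\; \pi_0 = p_{\mathrm{data}} \right\}
```

Under this reading, the first half-step $\pi^1$ corresponds to reversing the reference noising process while starting from the prior, which is consistent with the abstract's remark that the first DSB iteration recovers the approach of Song et al. (2021).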
Motivation & Objective
- Frame generative modeling as a Schrödinger bridge problem to overcome the need for long forward-time diffusion.
- Develop a tractable continuous-time and iterative framework (DSB) that solves the Schrödinger bridge via score-based diffusion.
- Provide theoretical convergence results for IPF in continuous state-space relevant to SGM.
- Demonstrate generative modeling capabilities on standard image datasets and show interpolation between data distributions.
Proposed method
- Formulate forward and reverse-time diffusions as SDEs and derive their continuous-time limits.
- Approximate the Schrödinger bridge using Iterative Proportional Fitting (IPF) in a continuous-state setting.
- Introduce Diffusion Schrödinger Bridge (DSB) as a practical IPF-like procedure that alternates refining forward and reverse transitions with score-matching.
- Use neural networks to approximate score functions and drift corrections via regression losses (equations (26) and (27)).
- Provide a theoretical convergence analysis showing total-variation bounds and IPF monotonicity under mild assumptions (Theorem 1 and related propositions).
- Demonstrate sampling from data with shorter time intervals and show interpolation between data distributions.
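The limitation that motivates the method, namely that a forward SDE run over a short horizon does not reach the prior, can be illustrated with a toy simulation. The sketch below is not taken from the paper: the Ornstein-Uhlenbeck dynamics, the two-cluster 1-D "data", the horizons, and the function name `forward_ou` are all assumptions chosen for illustration.

```python
import numpy as np

def forward_ou(x0, horizon, n_steps, rng):
    """Euler-Maruyama for dX = -X dt + sqrt(2) dW (stationary law N(0, 1))."""
    dt = horizon / n_steps
    x = x0.copy()
    for _ in range(n_steps):
        x = x - x * dt + np.sqrt(2.0 * dt) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(0)

# Hypothetical 1-D "data": two tight clusters around -3 and +3.
n_samples = 20_000
x0 = rng.choice([-3.0, 3.0], size=n_samples) + 0.1 * rng.standard_normal(n_samples)

# Same step size (0.01), two different time horizons.
x_short = forward_ou(x0, horizon=0.5, n_steps=50, rng=rng)
x_long = forward_ou(x0, horizon=5.0, n_steps=500, rng=rng)

var_short = x_short.var()
var_long = x_long.var()
print(f"variance after T=0.5: {var_short:.2f} (prior variance is 1)")
print(f"variance after T=5.0: {var_long:.2f} (prior variance is 1)")
```

With the short horizon the terminal marginal is still visibly bimodal and over-dispersed relative to the N(0, 1) prior, so a reverse process initialized from the prior starts from the wrong distribution. Per the abstract, this mismatch at the final time is what subsequent DSB iterations are meant to reduce.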
Experimental results
Research questions
- RQ1: Can generative modeling be framed as solving a Schrödinger bridge between data and a prior distribution?
- RQ2: Does IPF in continuous state-space yield convergent diffusion paths that approximate the data distribution in finite time?
- RQ3: How can score-based diffusion be integrated into a Schrödinger bridge framework using neural network score estimators?
- RQ4: What are the convergence properties and rates for continuous IPF in this setting?
- RQ5: Do multi-iteration DSB procedures improve data-marginal alignment and enable data interpolation?
Key findings
- DSB provides a finite-time diffusion-based solution to the Schrödinger bridge problem that improves upon traditional long-time forward diffusion.
- The first DSB iteration recovers the methodology of Song et al. (2021), with the added flexibility of shorter time intervals; subsequent iterations further reduce the discrepancy between the final-time marginals and their target distributions.
- The paper provides quantitative convergence results for IPF in continuous state-space without relying on compactness, and proves monotonicity in KL and total variation of iterates.
- DSB can be viewed as a continuous-time IPF, with a practical algorithm (Algorithm 1) that alternates forward and backward network updates to approximate the bridge.
- Experiments demonstrate image generation on MNIST and CelebA, and show that multiple DSB steps consistently improve generative performance and enable interpolation between data distributions.
- The framework offers a continuous-state analogue of the Sinkhorn algorithm for computational optimal transport.
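For readers unfamiliar with the discrete counterpart, here is a minimal NumPy sketch of the Sinkhorn algorithm (Cuturi, 2013), which is IPF applied to a Gibbs kernel on a finite state space. It is not the paper's Algorithm 1: the function name `sinkhorn`, the toy marginals, the squared-distance cost, and the regularization value `eps=0.5` are assumptions made for illustration.

```python
import numpy as np

def sinkhorn(a, b, cost, eps=0.5, n_iters=200):
    """Entropy-regularized OT plan between histograms a and b via Sinkhorn/IPF."""
    kernel = np.exp(-cost / eps)   # Gibbs kernel, plays the role of the reference
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (kernel @ v)       # rescale rows: enforce the first marginal
        v = b / (kernel.T @ u)     # rescale columns: enforce the second marginal
    return u[:, None] * kernel * v[None, :]

# Toy problem (hypothetical): 5 source points, 4 target points on [0, 1].
x = np.linspace(0.0, 1.0, 5)
y = np.linspace(0.0, 1.0, 4)
a = np.full(5, 0.2)
b = np.array([0.1, 0.2, 0.3, 0.4])
cost = (x[:, None] - y[None, :]) ** 2

plan = sinkhorn(a, b, cost)
print("row marginals:   ", plan.sum(axis=1))
print("column marginals:", plan.sum(axis=0))
```

Each half-step rescales the coupling so that one marginal matches exactly, mirroring the alternating forward and backward updates described above. In the continuous state-space setting the rescaling vectors are not available in closed form, which is where the review's description of neural-network regression losses comes in.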
This review was created by AI and reviewed by human editors.