[Paper Review] Improving Generative Model-based Unfolding with Schrödinger Bridges
SBUnfold uses Schrödinger Bridges to learn a diffusion-based generative mapping for unfolding, combining strengths of OmniFold and IcINN, and shows improved performance over state-of-the-art methods on a synthetic Z+jets dataset, especially with limited data.
Machine learning-based unfolding has enabled unbinned and high-dimensional differential cross section measurements. Two main approaches have emerged in this research area: one based on discriminative models and one based on generative models. The main advantage of discriminative models is that they learn a small correction to a starting simulation, while generative models scale better to regions of phase space with little data. We propose to use Schrödinger Bridges and diffusion models to create SBUnfold, an unfolding approach that combines the strengths of both discriminative and generative models. The key feature of SBUnfold is that its generative model maps one set of events into another without having to go through a known probability density, as is the case for normalizing flows and standard diffusion models. We show that SBUnfold achieves excellent performance compared to state-of-the-art methods on a synthetic Z+jets dataset.
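To make the idea of mapping one set of events into another more concrete, here is a minimal sketch of an intermediate state sampled on a Brownian bridge between paired events. It is an illustration only, not the paper's implementation: the function name `bridge_sample`, the constant noise scale `sigma`, and the convention that t = 0 is generator level and t = 1 is reconstructed level are all assumptions made for this example.

```python
import numpy as np

def bridge_sample(x_gen, x_reco, t, sigma=1.0, rng=None):
    """Sample x_t on a Brownian bridge pinned at x_gen (t=0) and x_reco (t=1).

    Hypothetical helper for illustration; a constant noise scale is assumed.
    x_gen, x_reco: arrays of shape (n_events, n_features), paired event by event.
    t: array of shape (n_events,) with values in [0, 1].
    """
    rng = np.random.default_rng() if rng is None else rng
    x_gen = np.asarray(x_gen, dtype=float)
    x_reco = np.asarray(x_reco, dtype=float)
    t = np.asarray(t, dtype=float).reshape(-1, 1)

    # The mean interpolates linearly between the two endpoints.
    mean = (1.0 - t) * x_gen + t * x_reco
    # The spread vanishes at both endpoints, so neither end is a fixed noise prior.
    std = sigma * np.sqrt(t * (1.0 - t))
    return mean + std * rng.standard_normal(x_gen.shape)
```

Both endpoints are data here: at t = 0 the sample equals the generator-level event and at t = 1 it equals the reconstructed event, which contrasts with a standard diffusion model whose terminal state is pure Gaussian noise.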
Motivation & Objective
- Improve machine learning-based unfolding for differential cross section measurements by combining the strengths of discriminative and generative models.
- Develop SBUnfold using Schrödinger Bridges to map between datasets without requiring a known source density.
- Assess SBUnfold against OmniFold and IcINN (cINN) on a synthetic Z+jets dataset.
- Evaluate performance with varying data availability to test robustness and data-efficiency.
Proposed method
- Describe Schrödinger Bridges (SB) and their relation to diffusion models.
- Integrate SB into an unfolding workflow by replacing the E-step of IcINN with a Schrödinger Bridge-based transport.
- Use a diffusion process that starts from reconstructed (detector-level) events and denoises them toward the generator-level distribution.
- Compare stochastic and deterministic (ODE) sampling regimes and select a deterministic variant for simplicity.
- Train on Pythia-generated simulations and evaluate unfolding quality against a Herwig-based pseudo-data set, using metrics such as the earth mover's distance (EMD) and the triangular discriminator.
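The two metrics named in the last step can be sketched for a single observable as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's evaluation code: the function names are hypothetical, the EMD is the one-dimensional Wasserstein-1 distance between unweighted samples, and the triangular discriminator is computed on unit-normalized histograms with shared bin edges (published values are sometimes multiplied by a constant scale factor, which is omitted here).

```python
import numpy as np

def emd_1d(x, y):
    """Unbinned 1D earth mover's distance between two unweighted samples."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    all_vals = np.sort(np.concatenate([x, y]))
    deltas = np.diff(all_vals)
    # Empirical CDFs evaluated on every interval between consecutive points.
    cdf_x = np.searchsorted(x, all_vals[:-1], side="right") / x.size
    cdf_y = np.searchsorted(y, all_vals[:-1], side="right") / y.size
    # Integrate the absolute CDF difference over the real line.
    return float(np.sum(np.abs(cdf_x - cdf_y) * deltas))

def triangular_discriminator(x, y, bins=50):
    """Binned triangular discriminator: 0 for identical, 1 for disjoint histograms."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Shared bin edges so both histograms are directly comparable.
    edges = np.histogram_bin_edges(np.concatenate([x, y]), bins=bins)
    p, _ = np.histogram(x, bins=edges)
    q, _ = np.histogram(y, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    denom = p + q
    mask = denom > 0  # skip bins that are empty in both histograms
    return float(0.5 * np.sum((p[mask] - q[mask]) ** 2 / denom[mask]))
```

For both metrics, lower values mean closer agreement between the unfolded sample and the generator-level truth. The EMD needs no binning, while the triangular discriminator depends on the chosen bin count.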

Experimental results
Research questions
- RQ1: Can Schrödinger Bridges provide a data-efficient, accurate transport between detector- and generator-level distributions in unfolding?
- RQ2: Does SBUnfold retain the practical advantages of both OmniFold (data-driven E-step) and IcINN (simulation-driven E-step) while mitigating their weaknesses?
- RQ3: How does SBUnfold perform relative to cINN and OmniFold under varying data availability (e.g., reduced data samples)?
Key findings
- SBUnfold consistently achieves lower unbinned earth mover’s distance (EMD) and often lower triangular discriminator values than cINN across multiple jet observables.
- SBUnfold shows improved fidelity for distributions with sharp features and benefits from informative priors derived from reconstructed-level events.
- When data are scarce, SBUnfold demonstrates more robust performance than OmniFold, which can degrade under limited data scenarios.
- Using Herwig as pseudo-data, SBUnfold achieves better agreement with generator-level distributions than cINN and, in several cases, than OmniFold (Step 1) across six jet observables.
- Migration matrices indicate SBUnfold applies small, diagonal-like corrections to reconstructed events, consistent with a mild unfolding.
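A migration matrix of the kind mentioned in the last finding can be built from paired reconstructed and unfolded values of one observable. The sketch below is an assumption-laden illustration rather than the paper's code: the name `migration_matrix` is hypothetical, rows index the reconstructed bin, columns index the unfolded bin, and each non-empty row is normalized to sum to one.

```python
import numpy as np

def migration_matrix(reco, unfolded, edges):
    """Row-normalized 2D histogram of (reconstructed, unfolded) values.

    Entry [i, j] is the fraction of events in reconstructed bin i that end up
    in unfolded bin j. Rows with no events are left as zeros.
    """
    reco = np.asarray(reco, dtype=float)
    unfolded = np.asarray(unfolded, dtype=float)
    counts, _, _ = np.histogram2d(reco, unfolded, bins=[edges, edges])
    row_sums = counts.sum(axis=1, keepdims=True)
    # Divide only where the row is populated to avoid division by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

def diagonal_fraction(matrix):
    """Average diagonal entry over non-empty rows; near 1 means small corrections."""
    populated = matrix.sum(axis=1) > 0
    return float(np.diag(matrix)[populated].mean())
```

A matrix close to the identity, with `diagonal_fraction` near one, corresponds to the small, diagonal-like corrections described above. Larger off-diagonal entries would indicate that events move across bins during unfolding.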

This review was created by AI and reviewed by human editors.