Adjoint Schrödinger Bridge Matching
- Adjoint Schrödinger Bridge Matching (ASBM) is a generative modeling framework that formulates entropy-regularized optimal transport as a stochastic control problem for kinetic energy minimization.
- It employs efficient adjoint and corrector matching, using neural regression and local losses to reduce noise and computational cost during high-dimensional sampling.
- ASBM supports continuous and discrete domains with strong convergence guarantees, improving sample quality, scalability, and path optimality in practical applications.
Adjoint Schrödinger Bridge Matching (ASBM) defines a family of scalable algorithms for generative modeling and sampling, built on the Schrödinger Bridge (SB) formulation of entropy-regularized optimal transport. ASBM efficiently computes cost-optimal, kinetic-energy-minimizing couplings between complex high-dimensional source and target distributions by leveraging controlled stochastic processes, adjoint-based matching, and, in discrete-time variants, adversarial learning. The methodology generalizes beyond memoryless diffusion and supports both continuous and discrete state spaces, offering global convergence guarantees and empirical advances in sample efficiency, scalability, and path optimality.
1. Schrödinger Bridge Formulation and Stochastic Optimal Control
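ASBM is grounded in the Schrödinger Bridge problem, which seeks the path measure $p^u$ of a controlled process minimizing the relative entropy to an uncontrolled reference path measure $p^{\mathrm{base}}$, subject to prescribed initial and terminal marginals $\mu$, $\nu$. For an Itô diffusion,
$$dX_t = \left[f_t(X_t) + \sigma_t u_t(X_t)\right]dt + \sigma_t\,dW_t, \qquad X_0 \sim \mu,$$
the SB problem is
$$\min_u \; D_{\mathrm{KL}}\!\left(p^u \,\|\, p^{\mathrm{base}}\right) = \mathbb{E}_{X \sim p^u}\!\left[\int_0^1 \tfrac{1}{2}\|u_t(X_t)\|^2\,dt\right] \quad \text{s.t.} \quad X_0 \sim \mu,\; X_1 \sim \nu,$$
with $p^{\mathrm{base}} := p^{u \equiv 0}$. This yields a unique minimum-kinetic-energy interpolant between marginals via optimal drift $u_t^\star(x) = \sigma_t \nabla \log \varphi_t(x)$, expressible with SB "potentials" $\varphi_t$, $\hat\varphi_t$ solving forward-backward heat-type equations:
$$\partial_t \varphi_t = -\nabla \varphi_t \cdot f_t - \tfrac{\sigma_t^2}{2}\Delta \varphi_t, \qquad \partial_t \hat\varphi_t = -\nabla \cdot (\hat\varphi_t f_t) + \tfrac{\sigma_t^2}{2}\Delta \hat\varphi_t, \qquad \varphi_0 \hat\varphi_0 = \mu, \quad \varphi_1 \hat\varphi_1 = \nu.$$
Where $\nu$ is an intractable energy-based law (e.g., Boltzmann $\nu(x) \propto e^{-E(x)}$), the problem may be equivalently posed as a stochastic optimal control (SOC):
$$\min_u \; \mathbb{E}_{X \sim p^u}\!\left[\int_0^1 \tfrac{1}{2}\|u_t(X_t)\|^2\,dt + g(X_1)\right], \qquad g(x) = \log \frac{\hat\varphi_1(x)}{\nu(x)}, \qquad X_0 \sim \mu,$$
where, in the memoryless case, $\hat\varphi_1$ reduces to $p_1^{\mathrm{base}}$, the endpoint marginal of the base process. The unique minimizer again coincides with the kinetic-optimal drift $u^\star$, linking SB directly to optimal-control sampling frameworks (Liu et al., 27 Jun 2025, Shin et al., 17 Feb 2026).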
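The controlled diffusion and its kinetic-energy cost can be made concrete with a small simulation. The following is a minimal sketch, not the authors' implementation: it assumes a driftless base process (zero base drift), a constant noise scale `sigma`, and a user-supplied control `drift_u(t, x)`; the function name and all parameters are hypothetical.

```python
import numpy as np

def simulate_controlled_sde(x0, drift_u, sigma=1.0, n_steps=100, rng=None):
    """Euler-Maruyama for dX = sigma * u(t, X) dt + sigma dW on t in [0, 1].

    Assumes zero base drift. Returns the terminal states and, per path, a
    Riemann-sum estimate of the kinetic energy int_0^1 0.5 * ||u_t(X_t)||^2 dt.
    """
    rng = np.random.default_rng() if rng is None else rng
    dt = 1.0 / n_steps
    x = np.array(x0, dtype=float)          # shape (n_paths, dim)
    energy = np.zeros(x.shape[0])
    for k in range(n_steps):
        u = drift_u(k * dt, x)             # control evaluated at the current state
        energy += 0.5 * np.sum(u**2, axis=1) * dt
        noise = rng.standard_normal(x.shape)
        x = x + sigma * u * dt + sigma * np.sqrt(dt) * noise
    return x, energy

# Usage: a constant unit control in 2-D costs 0.5 * ||(1, 1)||^2 = 1 per path.
x1, energy = simulate_controlled_sde(
    np.zeros((1000, 2)), lambda t, x: np.ones_like(x), sigma=0.5
)
```

With `drift_u` returning zeros the paths follow the base process and the accumulated energy is zero, which is the reference measure the relative entropy is taken against.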
2. Adjoint Matching Principles and Algorithmic Implementation
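Central to ASBM is adjoint matching, which replaces slow and unstable score evaluation and long SDE backpropagation with local, efficient losses based on backward adjoint processes. Defining the adjoint process $a_t$ by $\tfrac{d}{dt} a_t = -\nabla f_t(X_t)^\top a_t$ with terminal condition $a_1 = \nabla g(X_1)$, where $f_t$ is the base drift, $\sigma_t$ the noise scale, and $g$ the terminal cost of the SOC problem, the matching objective for the drift $u$ becomes
$$\mathcal{L}_{\mathrm{AM}}(u) = \mathbb{E}\!\left[\int_0^1 \left\| u_t(X_t) + \sigma_t a_t \right\|^2 dt\right],$$
with the expectation taken over trajectories generated by the current (gradient-detached) drift. For a Boltzmann target with energy $E$, the terminal gradient splits as $\nabla g = \nabla E + \nabla \log \hat\varphi_1$, where $\hat\varphi_1$ is the terminal backward SB potential; the second term cannot be evaluated directly and is supplied by a learned corrector.
Practical training alternates two losses:
- Adjoint Matching (AM): Minimizes $\mathcal{L}_{\mathrm{AM}}(u)$ over samples simulated under the current drift.
- Corrector Matching (CM): Regresses a corrector $h(X_1)$ to target scores $\nabla_{x_1} \log p^{\mathrm{base}}_{1|0}(X_1 \mid X_0)$, approximating the unknown endpoint score $\nabla \log \hat\varphi_1$.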
In discrete-time and discrete-space settings, this paradigm is adapted using Markov chain dynamics and fixed-point equations with Bregman divergences for controller and corrector potential ratios, exploiting cyclic-group structure for tractability (Guo et al., 9 Feb 2026).
The ASBM (or ASBS) algorithm proceeds as alternating AM/CM steps, each involving simulation of short SDE or CTMC bridges, endpoint target computation, and local neural regression. This configuration instantiates Iterative Proportional Fitting (IPF) in path space, inheriting its geometric convergence properties (Liu et al., 27 Jun 2025).
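The regression targets of one AM/CM alternation can be sketched in a few lines. This is an illustrative toy, not the published algorithm: it assumes a driftless Brownian base process $dX = \sigma\,dW$ (so the conditional score is available in closed form and the adjoint reduces to the terminal-cost gradient), replaces the neural networks with a linear least-squares fit, and uses a hypothetical quadratic energy. All function names are made up for the example.

```python
import numpy as np

def cm_targets(x0, x1, sigma):
    """Corrector targets: grad_{x1} log p_base(x1 | x0) for dX = sigma dW on [0, 1]."""
    return -(x1 - x0) / sigma**2

def fit_linear(x, y):
    """Least-squares fit y ~ a * x + b (a stand-in for neural regression)."""
    design = np.stack([x, np.ones_like(x)], axis=1)
    (a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    return a, b

def am_targets(x1, grad_energy, corrector, sigma):
    """Adjoint targets: the drift at bridge points is regressed onto -sigma * (grad E + h)(X_1)."""
    return -sigma * (grad_energy(x1) + corrector(x1))

rng = np.random.default_rng(0)
sigma, s0, n = 1.5, 0.5, 200_000
x0 = s0 * rng.standard_normal(n)              # source samples, mu = N(0, s0^2)
x1 = x0 + sigma * rng.standard_normal(n)      # endpoints under the current drift (here u = 0)

# CM step: fit the corrector h on the simulated endpoint pairs.
a, b = fit_linear(x1, cm_targets(x0, x1, sigma))
corrector = lambda x: a * x + b

# AM step: build the targets that the drift network would be regressed onto.
grad_energy = lambda x: x - 2.0               # hypothetical E(x) = 0.5 * (x - 2)^2
targets = am_targets(x1, grad_energy, corrector, sigma)
```

Because the endpoints here come from the uncontrolled process, the fitted corrector should recover the score of its Gaussian endpoint marginal, i.e. a slope close to $-1/(s_0^2 + \sigma^2) = -0.4$, which gives a quick sanity check on the CM target.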
3. Non-Memoryless Regimes and Practical Advantages
Classical diffusion models operate in a memoryless regime, coupling the endpoints $X_0$, $X_1$ independently, resulting in highly curved trajectories, noisy score targets, and inefficient sampling. ASBM leverages the reciprocal coupling property of the Schrödinger Bridge: by simulating the optimal endpoint coupling and reconstructing trajectories via base process kernels $p^{\mathrm{base}}_{t|0,1}$, the method induces much straighter, energetically optimal sampling paths.
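Reconstructing intermediate states from an endpoint pair is cheap when the base bridge is known in closed form. The sketch below assumes a driftless Brownian base process with constant noise scale, for which the bridge is Gaussian with mean $(1-t)x_0 + t x_1$ and variance $\sigma^2 t(1-t)$; the function name is hypothetical and other base processes would need their own kernel.

```python
import numpy as np

def sample_bridge(x0, x1, t, sigma=1.0, rng=None):
    """Draw X_t from the Brownian bridge between x0 (t = 0) and x1 (t = 1).

    Assumes the base process dX = sigma dW; t may be a scalar or an array
    that broadcasts against x0 and x1.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = np.asarray(t, dtype=float)
    mean = (1.0 - t) * x0 + t * x1            # straight-line interpolation
    std = sigma * np.sqrt(t * (1.0 - t))      # vanishes at both endpoints
    return mean + std * rng.standard_normal(np.shape(mean))

# Usage: one random time per endpoint pair, as in a local regression loss.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 2))
x1 = x0 + 1.0
t = rng.uniform(size=(4, 1))
xt = sample_bridge(x0, x1, t, sigma=0.3, rng=rng)
```

Only the endpoint pair and a random time are needed per training example, so no full trajectory has to be stored.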
ASBM thus supports:
- Arbitrary source distributions (e.g., Gaussian, learned priors, flows): the memoryless condition is fully relaxed.
- No importance weighting or expensive energy/model evaluations: all training uses on-policy Monte Carlo bridges.
- Reduction in number of function evaluations (NFEs) at sampling and training time, with only 4 compute per stage (Liu et al., 27 Jun 2025, Shin et al., 17 Feb 2026). Empirically, this yields significantly improved sample quality, stability, and scaling to high-dimensional data and energy-based targets.
4. Discrete and Adversarial Variants
ASBM extends naturally to discrete state spaces by constructing reference CTMCs (often with an additive cyclic group structure) and leveraging discrete potential recurrences. A controller/corrector architecture alternates updates for controller potential ratios and corrector potential ratios, optimized via Bregman regression (Guo et al., 9 Feb 2026).
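To illustrate why a cyclic group structure helps, consider a reference CTMC that jumps to either neighbour on $\mathbb{Z}_n$ at equal rate. This is an assumed toy reference chain, not necessarily the one used by Guo et al.; its generator is circulant, so the transition kernel $e^{tQ}$ follows from a discrete Fourier transform without a general matrix exponential.

```python
import numpy as np

def cyclic_ctmc_kernel(n_states, t, rate=1.0):
    """Transition matrix exp(t Q) of a nearest-neighbour random walk on Z_n.

    Q is circulant (jumps +1 / -1 at equal rate), so the DFT diagonalises it.
    Requires n_states >= 2.
    """
    q_row = np.zeros(n_states)
    q_row[0] = -2.0 * rate
    q_row[1] += rate
    q_row[-1] += rate
    eigvals = np.fft.fft(q_row).real                 # real because q_row is symmetric
    p_row = np.fft.ifft(np.exp(t * eigvals)).real    # first row of exp(t Q)
    states = np.arange(n_states)
    idx = (states[None, :] - states[:, None]) % n_states
    return p_row[idx]                                # P[i, j] = p_row[(j - i) mod n]

# Usage: each row is the law of X_t given the starting state.
P = cyclic_ctmc_kernel(6, t=0.7)
```

The kernel depends only on the group difference `(j - i) mod n`, which is the translation invariance that makes fixed-point updates over potential ratios tractable in this setting.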
An orthogonal direction replaces the adjoint-matching loss with adversarial learning: Adversarial Schrödinger Bridge Matching (also abbreviated "ASBM") aligns with the Discrete-time Iterative Markovian Fitting (D-IMF) approach (Gushchin et al., 2024). Here, learning alternates Markovian and reciprocal projections in discrete time, trained end-to-end via a DD-GAN estimator. This achieves empirical performance comparable to continuous-time methods with only a handful of generation steps, substantially accelerating inference.
| Variant | Domain | Core Algorithmic Feature |
|---|---|---|
| ASBM/ASBS | Continuous | Alternating adjoint/corrector matching via SDE simulation |
| Discrete ASBM | Discrete | Cyclic group CTMC, Bregman matching |
| Adversarial ASBM | Discrete/Continuous | Alternating GAN-based Markovian/reciprocal projections |
5. Theoretical Properties and Convergence Guarantees
ASBM and its variants possess strong optimality and convergence properties. Under mild regularity and uniqueness of minimizers for the adjoint and corrector objectives, the alternating training process (adjoint + corrector matching) implements path-space IPF, which is known to converge geometrically to the unique Schrödinger Bridge drift $u^\star$ (Liu et al., 27 Jun 2025, Guo et al., 9 Feb 2026). Proofs leverage the connection between half-bridge KL minimization steps and the full bridge global optimum (static or dynamic), and in the discrete case, discrete IPF theory provides analogous guarantees.
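The alternating half-bridge structure is easiest to see in the static, finite-state case, where IPF is the classical Sinkhorn iteration. The sketch below is a toy analogue rather than the path-space procedure: `kernel` plays the role of the base endpoint transition density on a grid, and each half-step rescales the coupling to match one marginal exactly. The grid, marginals, and function name are assumptions chosen for the example.

```python
import numpy as np

def ipf(kernel, mu, nu, n_iters=200):
    """Static IPF: find a, b such that diag(a) @ kernel @ diag(b) has marginals mu, nu."""
    a = np.ones_like(mu)
    b = np.ones_like(nu)
    errors = []
    for _ in range(n_iters):
        a = mu / (kernel @ b)                  # half-step: match the source marginal
        # Target-marginal violation after fixing the source side.
        errors.append(np.abs(b * (kernel.T @ a) - nu).sum())
        b = nu / (kernel.T @ a)                # half-step: match the target marginal
    coupling = a[:, None] * kernel * b[None, :]
    return coupling, np.array(errors)

# Usage on a 1-D grid with a Gaussian base kernel (unit noise scale).
x = np.linspace(-1.0, 1.0, 20)
kernel = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
mu = np.full(20, 1.0 / 20)
nu = np.exp(-4.0 * (x - 0.5) ** 2)
nu /= nu.sum()
coupling, errors = ipf(kernel, mu, nu)
```

Plotting `errors` on a log scale gives a direct view of the geometric decay that the convergence results refer to; the rate worsens as the kernel becomes more concentrated, i.e. as the noise scale shrinks.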
In adversarial and discrete regimes, discrete-time analogues of the KL–Pythagorean theorem guarantee monotonic KL decrease over alternations, and exponential convergence rates are established in Gaussian cases (Gushchin et al., 2024). The discrete extension critically depends on the existence of a suitable cyclic group structure in the state space for fixed-point adjoint matching.
6. Empirical Performance and Benchmarking
ASBM demonstrates state-of-the-art empirical performance across several families of tasks:
- Synthetic Energy-Based Models: ASBM achieves lower Sinkhorn and Wasserstein distances on multi-well, double-well, and Lennard-Jones benchmarks versus PIS, DDS, LV-PIS, iDEM, and memoryless Adjoint Samplers. For instance, on LJ-13, the $\mathcal{W}_2$ distance drops from ≈1.67→1.59, and the energy-$\mathcal{W}_2$ from ≈2.40→1.28 (Liu et al., 27 Jun 2025).
- Molecular Generative Modeling: In conformer generation and sampling from molecular Boltzmann distributions, ASBM achieves higher recall, coverage, and lower RMSD versus RDKit and prior adjoint baselines, with all gains attributed to the bridge method, not architectural choices.
- Image Generation: On CIFAR-10, ASBM attains FID 3.16 at 100 NFEs, outperforming Score-SDE (4.61), SB-FBSDE (5.26), VSDM (4.24), and DSBM (9.68). In the low-NFE regime (T=25), ASBM achieves FID ≈8.85 versus Score-SDE ≈52.1. Similar trends hold in Stable-Diffusion latents (Shin et al., 17 Feb 2026).
- Discrete Lattice Models: On Ising and Potts benchmarks, discrete ASBM achieves magnetization error, correlation error, and energy-Wasserstein-2 distances that match or exceed discrete uniform samplers, at considerably lower training cost (Guo et al., 9 Feb 2026).
- Adversarial Translation: Adversarial ASBM achieves FID ≈16–18 on CelebA (male→female, N=3 steps) versus DSBM’s FID ≈38–90 with 100 steps; for Color-MNIST, ASBM (4 NFE) outperforms DSBM (100 NFE) (Gushchin et al., 2024).
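Several of the figures above are Wasserstein-type distances. As a rough illustration, and assuming that "energy-W2" means the 2-Wasserstein distance between the one-dimensional distributions of energy values under generated and reference samples, the metric can be computed exactly for equal-size sample sets by sorting. The function names and the example energy are hypothetical.

```python
import numpy as np

def wasserstein2_1d(a, b):
    """Exact W2 between two equal-size 1-D empirical distributions (sorted coupling)."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equally many samples on both sides")
    return float(np.sqrt(np.mean((a - b) ** 2)))

def energy_w2(samples, reference, energy_fn):
    """W2 between the energy values of generated and reference samples."""
    return wasserstein2_1d(energy_fn(samples), energy_fn(reference))

# Usage with a hypothetical quadratic energy on 3-D samples.
rng = np.random.default_rng(0)
energy_fn = lambda x: 0.5 * np.sum(x**2, axis=1)
generated = rng.standard_normal((5000, 3))
reference = rng.standard_normal((5000, 3))
score = energy_w2(generated, reference, energy_fn)
```

Comparing energies rather than raw coordinates makes the metric insensitive to symmetries such as rotations or particle permutations, which is one reason it is reported alongside the geometric distance.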
One-step generator distillation is enabled by ASBM’s straighter, low-variance trajectories, yielding FID 6.68 and recall 0.542 on CIFAR-10, both surpassing SDS and DMD baselines (Shin et al., 17 Feb 2026).
7. Extensions, Scalability, and Applications
ASBM is scalable to high-dimensional tasks by virtue of its path-wise, memory-efficient adjoint-gradient computations. It avoids storing full trajectories and does not require importance sampling or explicit evaluation of density ratios. Applications demonstrated in the literature include molecular structure sampling, image generation with few sampling steps, discrete combinatorial problem optimization (e.g., low-temperature Ising), and plug-and-play amortized conditional sampling in structural biology and drug design.
The algorithmic suite is adaptable to any reference chain or base process admitting efficient simulation and supports the use of learned or arbitrarily complex priors and energy functions. Discrete ASBM provides a unified framework for complex-value generative modeling and optimization in combinatorial spaces.
Adjoint Schrödinger Bridge Matching synthesizes stochastic optimal control with modern generative modeling and path-space transport, yielding globally optimal, scalable, and empirically validated solutions for sampling, generative modeling, and high-dimensional transport in both continuous and discrete domains (Liu et al., 27 Jun 2025, Shin et al., 17 Feb 2026, Guo et al., 9 Feb 2026, Gushchin et al., 2024).