DSBM-NeuralODE: Diffusion Bridge ODE
- The paper introduces a novel method that uses neural ODEs to approximate the optimal diffusion Schrödinger bridge, offering a scalable alternative to classical IPF solutions.
- It replaces stochastic differential equations with a deterministic ODE surrogate, enabling the use of high-order adaptive solvers and significantly reducing function evaluations.
- Empirical results on Gaussian transport and MNIST latent tasks show notable efficiency gains and competitive performance compared to traditional IPF-based and SINDy-FM methods.
DSBM-NeuralODE (Diffusion Schrödinger Bridge Matching with Neural ODEs) is a continuous-time generative modeling paradigm that parameterizes the Schrödinger bridge dynamics between two given probability measures via neural ordinary differential equations. Developed as a scalable, flexible, and efficient alternative to classical iterative proportional fitting and stochastic bridge solvers, DSBM-NeuralODE approximates the optimal bridge transport in high-dimensional latent spaces, with significant efficiency and adaptability advantages over baseline methods (Khilchuk et al., 14 Dec 2025).
1. Mathematical Foundations
The classical Schrödinger bridge (SB) seeks a stochastic process on path space that solves

$$\Pi^\star = \operatorname*{arg\,min}_{\Pi \,:\, \Pi_0 = \pi_0,\ \Pi_1 = \pi_1} \mathrm{KL}(\Pi \,\|\, Q),$$

where $\pi_0, \pi_1$ are prescribed marginals on $\mathbb{R}^d$, and $Q$ is a reference diffusion law

$$dX_t = f(X_t, t)\,dt + \sigma\,dW_t.$$

The optimal SB dynamics can be written as a stochastic differential equation (SDE)

$$dX_t = \bigl[f(X_t, t) + \sigma^2 \nabla_x \log \Psi_t(X_t)\bigr]\,dt + \sigma\,dW_t,$$

where $\sigma^2 \nabla_x \log \Psi_t$ encodes the correction drift determined by path-space conditional scores. Directly estimating this drift, as in classical iterative proportional fitting (IPF), proves computationally infeasible for high-dimensional applications.
DSBM-NeuralODE replaces the SDE drift with a deterministic ODE surrogate

$$\frac{dx_t}{dt} = v_\theta(x_t, t),$$

where $v_\theta$ is a time- and state-dependent velocity field parameterized by a neural network (“ODEFunc”). At the optimum, this field mimics the mean drift of the optimal SB process in expectation along solution paths (Khilchuk et al., 14 Dec 2025).
2. Training Objectives and Loss Formulation
DSBM-NeuralODE proceeds in two main training phases:
(a) Pre-training on Reference Diffusions:
A forward diffusion, typically with a DDPM-style schedule

$$dX_t = -\tfrac{1}{2}\beta(t)\,X_t\,dt + \sqrt{\beta(t)}\,dW_t,$$

is simulated to generate datasets of consecutive-state pairs $(x_t, x_{t+\Delta t})$. The ODE surrogate is initially trained by minimizing the regression loss

$$\mathcal{L}_{\mathrm{pre}}(\theta) = \mathbb{E}\,\Bigl\|\, v_\theta(x_t, t) - \frac{x_{t+\Delta t} - x_t}{\Delta t} \,\Bigr\|^2.$$

An analogous backward model $v_\phi$ is trained in the reverse direction.
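A minimal sketch of this pre-training regression, assuming a finite-difference velocity target computed from consecutive simulated states and a velocity field callable as `v_theta(t, x)`; all names here are illustrative, not from the paper:

```python
import torch

def pretrain_loss(v_theta, x_t, x_next, t, dt):
    """Regress the ODE surrogate onto finite-difference velocities of the
    simulated reference diffusion.
    x_t, x_next: (N, d) consecutive states; t: (N, 1) times; dt: step size."""
    target = (x_next - x_t) / dt          # empirical drift along reference trajectories
    return torch.mean((v_theta(t, x_t) - target) ** 2)
```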
(b) Iterative Schrödinger Bridge Matching (SBM):
Given endpoint pairs $(x_0, x_1)$, intermediate bridge states are sampled using Brownian bridge interpolation,

$$x_t = (1-t)\,x_0 + t\,x_1 + \sigma\sqrt{t(1-t)}\,\varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I).$$

At iteration $n$, for direction $d \in \{\mathrm{fwd}, \mathrm{bwd}\}$, the target velocity is constructed as

$$v^{\mathrm{fwd}}_{\mathrm{tgt}}(x_t, t) = \frac{x_1 - x_t}{1 - t}, \qquad v^{\mathrm{bwd}}_{\mathrm{tgt}}(x_t, t) = \frac{x_0 - x_t}{t},$$

and the main loss is

$$\mathcal{L}^{(n)}_d(\theta) = \mathbb{E}_{(x_0, x_1) \sim \Pi^{(n-1)},\ t \sim \mathcal{U}[0,1],\ x_t}\,\bigl\| v^d_\theta(x_t, t) - v^d_{\mathrm{tgt}}(x_t, t) \bigr\|^2.$$
Alternating minimization over forward and backward networks establishes the Iterative Markovian Fitting (IMF) process (Khilchuk et al., 14 Dec 2025, Shi et al., 2023).
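A sketch of the per-minibatch SBM objective under these definitions; the Brownian-bridge sampling and conditional drift targets follow the bridge-matching construction above, and names such as `sbm_loss` and `sigma` are illustrative:

```python
import torch

def sbm_loss(v_theta, x0, x1, sigma, forward=True, eps=1e-4):
    """Bridge-matching regression loss for a minibatch of endpoint pairs (x0, x1)."""
    n = x0.shape[0]
    t = torch.rand(n, 1, device=x0.device).clamp(eps, 1.0 - eps)   # t ~ U[0, 1]
    # Brownian-bridge interpolation between the endpoints
    x_t = (1 - t) * x0 + t * x1 + sigma * torch.sqrt(t * (1 - t)) * torch.randn_like(x0)
    # conditional target velocity: drift of the pinned process
    target = (x1 - x_t) / (1 - t) if forward else (x0 - x_t) / t
    return torch.mean((v_theta(t, x_t) - target) ** 2)
```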
3. Algorithmic Implementation
Below is the canonical DSBM-NeuralODE workflow:
- Pre-training
- Simulate diffusion trajectories; collect consecutive pairs $(x_t, x_{t+\Delta t})$.
- Fit $v_\theta$ and the backward model $v_\phi$ via their respective regression losses.
- Initialization
- Set the initial coupling by sampling $x_0 \sim \pi_0$ and $x_1 \sim \pi_1$ independently.
- Iterative Matching (for $n = 1$ to $N$)
- Sample minibatches of endpoint pairs $(x_0, x_1)$ from the current coupling $\Pi^{(n-1)}$.
- For each direction $d \in \{\mathrm{fwd}, \mathrm{bwd}\}$:
- Sample $t \sim \mathcal{U}[0, 1]$ and generate the interpolated state $x_t$.
- Compute $\mathcal{L}^{(n)}_d$ and update the ODE network by gradient steps.
- Update the coupling $\Pi^{(n)}$ by propagating samples with the learned ODE or SDE.
- Inference (Sampling)
- Sample $x_0 \sim \pi_0$ and integrate $\frac{dx_t}{dt} = v_\theta(x_t, t)$ from $t = 0$ to $t = 1$ using an adaptive ODE solver or Euler–Maruyama (Khilchuk et al., 14 Dec 2025, Shi et al., 2023), as sketched below.
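A minimal sampling sketch, assuming the trained forward field is callable as `v_fwd(t, x)` and that the `torchdiffeq` package supplies the adaptive Dormand–Prince integrator; solver tolerances are illustrative:

```python
import torch
from torchdiffeq import odeint   # pip install torchdiffeq

@torch.no_grad()
def sample(v_fwd, x0, rtol=1e-5, atol=1e-5):
    """Push x0 ~ pi_0 through the learned ODE from t = 0 to t = 1."""
    ts = torch.tensor([0.0, 1.0])
    # odeint calls v_fwd(t, x); 'dopri5' is the adaptive Dormand-Prince solver
    path = odeint(v_fwd, x0, ts, method="dopri5", rtol=rtol, atol=atol)
    return path[-1]              # states at t = 1, i.e. approximate samples from pi_1
```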
4. Architecture and Design Choices
The velocity field $v_\theta$ is parameterized by a multilayer perceptron (MLP) with 2 hidden layers. For Gaussian transport tasks, widths are set to [64, 64] with ReLU activations; for MNIST latent translation, [128, 128] with Swish activations is used. The input consists of the state vector concatenated with a positional encoding of time $t$. Regularization uses weight decay (no dropout), with the Adam optimizer. The per-direction parameter count of DSBM-NeuralODE is comparable across the Gaussian and MNIST tasks (Khilchuk et al., 14 Dec 2025).
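A sketch of such a velocity-field network, with a sinusoidal positional encoding of $t$ concatenated to the state; the encoding width `time_feats` and other details are assumptions for illustration:

```python
import math
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    """Velocity field v_theta: an MLP on [state, sinusoidal time encoding]."""

    def __init__(self, dim, hidden=(64, 64), time_feats=8, act=nn.ReLU):
        super().__init__()
        self.time_feats = time_feats
        layers, width = [], dim + 2 * time_feats
        for h in hidden:
            layers += [nn.Linear(width, h), act()]
            width = h
        layers.append(nn.Linear(width, dim))
        self.net = nn.Sequential(*layers)

    def _encode_time(self, t, batch, dtype, device):
        # accepts a scalar time (as supplied by ODE solvers) or a per-sample column
        t = torch.as_tensor(t, dtype=dtype, device=device)
        t = t.expand(batch, 1) if t.dim() == 0 else t.view(batch, 1)
        freqs = math.pi * (2.0 ** torch.arange(self.time_feats, dtype=dtype, device=device))
        return torch.cat([torch.sin(t * freqs), torch.cos(t * freqs)], dim=-1)

    def forward(self, t, x):                     # (t, x) argument order matches odeint
        te = self._encode_time(t, x.shape[0], x.dtype, x.device)
        return self.net(torch.cat([x, te], dim=-1))
```

For the MNIST latent task, the same sketch would use `hidden=(128, 128)` and `act=nn.SiLU` (Swish).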
5. Efficiency, Interpretability, and Empirical Results
DSBM-NeuralODE leverages the deterministic ODE formulation to enable high-order adaptive solvers (e.g., Dormand–Prince), reducing the required number of function evaluations (NFEs) by a factor of roughly 5–10 compared to fixed-step SDE samplers. On Gaussian transport, $1,000$ samples are generated in around $10$ seconds on CPU, a speedup over IPF-based diffusion bridge methods. The ODE surrogate’s smoothness in time facilitates more stable integration and visualization diagnostics relative to conventional SDE approaches.
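The NFE savings can be measured directly by counting calls to the velocity field during integration; a small counting wrapper (compatible with the sampling sketch above) is one way to do this:

```python
import torch

class CountingField(torch.nn.Module):
    """Wraps a velocity field and counts its function evaluations (NFEs)."""
    def __init__(self, field):
        super().__init__()
        self.field = field
        self.nfe = 0

    def forward(self, t, x):
        self.nfe += 1
        return self.field(t, x)

# e.g.: counted = CountingField(v_fwd); sample(counted, x0); print(counted.nfe)
```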
The method remains less interpretable than symbolic SINDy-FM surrogates (which enable near-instantaneous inference and sparse models), but interpretability can be partially recovered through feature attribution and sensitivity analysis tools.
Empirical benchmarks demonstrate:
- Gaussian transport: DSBM-NeuralODE requires training/inference times of $2,326$ s / $21.8$ s; the baseline DSBM (IPF) requires $90$ s / $0.08$ s.
- MNIST latent translation ($8$-dim VAE): DSBM-NeuralODE attains FID = 72.2, Inception Score = 1.47, digit accuracy = 0.912, training = 450 s, inference = 0.08 s (Khilchuk et al., 14 Dec 2025).
In both cases, SINDy-FM achieves close performance with far fewer parameters and faster inference, but cannot match DSBM-NeuralODE for tasks requiring more expressive non-linear bridge dynamics.
| Task | Model | Quality | Train (s) | Infer (s) | Params |
|---|---|---|---|---|---|
| Gaussian transport | DSBM-NeuralODE | -- | 2,326 | 21.8 | -- |
| Gaussian transport | DSBM (IPF) | -- | 90 | 0.08 | -- |
| MNIST latent, 2→3 | DSBM-NeuralODE | FID = 72.2, IS = 1.47 | 450 | 0.08 | -- |
| MNIST latent, 2→3 | SINDy-FM | FID ≈ 83–89 | -- | 0.001 | 541–923 |
6. Connections to Unified Bridge Paradigms and Theoretical Guarantees
DSBM-NeuralODE belongs to the broader class of unified bridge algorithms (UBA), which encompasses:
- DSBM (Schrödinger Bridge Matching): SDE with nonzero reference noise $\sigma > 0$.
- Flow Matching: ODE (zero-noise limit, $\sigma \to 0$), as in Benamou–Brenier optimal transport.
Both DSBM and flow matching minimize conditional MSE losses over “pinned” processes interpolating $x_0$ and $x_1$; the difference lies in the level of stochasticity and the choice of process path law (Kim, 27 Mar 2025).
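Written out, the shared conditional objective (a standard formulation, stated here for orientation) is

$$\min_\theta \; \mathbb{E}_{(x_0, x_1) \sim \Pi,\, t \sim \mathcal{U}[0,1],\, \varepsilon \sim \mathcal{N}(0, I)} \Bigl\| v_\theta(x_t, t) - \frac{x_1 - x_t}{1 - t} \Bigr\|^2, \qquad x_t = (1-t)\,x_0 + t\,x_1 + \sigma\sqrt{t(1-t)}\,\varepsilon.$$

For $\sigma > 0$ this is the DSBM bridge-matching loss; at $\sigma = 0$ the interpolant is deterministic and the target reduces to $x_1 - x_0$, i.e. the straight-line flow-matching objective.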
Theoretical results guarantee:
- Each DSBM iteration decreases the KL divergence to the optimal bridge; in the limit, the iterates converge to the true bridge.
- As $\sigma \to 0$, SB solutions converge to the minimal-kinetic-energy optimal transport solution (Benamou–Brenier flow, stated explicitly below), recovered by flow-matching objectives.
- Universal approximation: Any time-state drift is representable in a single iteration by the ODE surrogate, assuming sufficient model capacity and minimization accuracy (Kim, 27 Mar 2025, Khilchuk et al., 14 Dec 2025).
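For context, the minimal-kinetic-energy problem referenced above is the standard Benamou–Brenier dynamic formulation of optimal transport:

$$\min_{(\rho_t, v_t)} \int_0^1 \!\! \int_{\mathbb{R}^d} \rho_t(x)\,\|v_t(x)\|^2 \, dx\, dt \quad \text{s.t.} \quad \partial_t \rho_t + \nabla \cdot (\rho_t v_t) = 0, \quad \rho_0 = \pi_0,\ \rho_1 = \pi_1.$$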
7. Limitations and Applicability Spectrum
DSBM-NeuralODE offers a balance between sample efficiency, expressiveness, and computational tractability. The ODE formulation enables advanced solvers and substantial speedups but is less interpretable and, due to overparameterization, can entail higher training costs. The method is best suited when high-fidelity reconstruction of non-linear bridge dynamics is essential. SINDy-FM remains preferable when interpretability and minimal parameterization are paramount, while classical IPF or SDE-based approaches may still be optimal for low-dimensional or limited-scale scenarios (Khilchuk et al., 14 Dec 2025).