Data-to-Energy Schrödinger Bridge Training

Updated 4 July 2026
  • Data-to-Energy Schrödinger Bridge Training is a stochastic method that transports empirical data to an energy-defined target by minimizing path-space relative entropy.
  • It employs iterative proportional fitting and matching-based solvers to derive Schrödinger potentials and control policies without relying on explicit target samples.
  • Empirical results demonstrate that non-memoryless couplings yield straighter trajectories and reduced transport costs, outperforming traditional score-based diffusion approaches.

Data-to-Energy Schrödinger Bridge Training is a class of methods for learning a stochastic process that transports a source distribution available through data samples to a target distribution specified only through an unnormalized energy, typically $p_{1}(x)\propto \exp(-E(x))$, while minimizing a path-space relative entropy with respect to a reference diffusion. The central difficulty is the absence of target samples: the terminal law is known through $E(x)$ but not through an empirical dataset. Recent work addresses this by recasting the forward bridge as stochastic optimal control, deriving matching-based or iterative proportional fitting procedures that operate directly on the energy function, and exploiting non-memoryless couplings to obtain straighter and more efficient trajectories than ordinary diffusion training (Shin et al., 17 Feb 2026, Tamogashev et al., 30 Sep 2025).

1. Problem formulation and scope

In the canonical data-to-energy setting, one starts from a base SDE

$$dX_t = f_t(X_t)\,dt + \sigma_t\,dW_t,\qquad X_0\sim p_{\mathrm{data}},$$

introduces a control $u_t(x)$, and considers the controlled process

$$dX_t = [f_t(X_t)+\sigma_t u_t(X_t)]\,dt + \sigma_t\,dW_t,$$

with the endpoint constraint

$$X_1 \sim p_{\mathrm{prior}}(x)\propto \exp(-E(x)).$$

The associated Schrödinger bridge problem minimizes the relative entropy between the controlled path measure and the base path measure, equivalently minimizing

$$\mathbb E_{p^u}\Bigl[\frac12\int_0^1 \|u_t(X_t)\|^2\,dt\Bigr]$$

subject to the endpoint marginals (Shin et al., 17 Feb 2026).
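As a rough illustration of the objective above, the sketch below simulates the controlled SDE with Euler-Maruyama and forms a Monte Carlo estimate of the control energy. The names `simulate_controlled_sde`, `drift`, `control`, and `sigma` are hypothetical, and the zero base drift and constant control in the usage lines are toy assumptions rather than choices taken from the cited papers.

```python
import numpy as np

def simulate_controlled_sde(x0, drift, control, sigma, n_steps=100, seed=0):
    """Euler-Maruyama for dX = [f_t(X) + sigma_t u_t(X)] dt + sigma_t dW on t in [0, 1].

    Returns the terminal states and, per path, a Riemann-sum estimate of
    0.5 * int_0^1 ||u_t(X_t)||^2 dt.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)            # shape (n_paths, dim)
    dt = 1.0 / n_steps
    energy = np.zeros(x.shape[0])
    for k in range(n_steps):
        t = k * dt
        u = control(t, x)
        s = sigma(t)
        energy += 0.5 * np.sum(u ** 2, axis=1) * dt
        noise = rng.standard_normal(x.shape)
        x = x + (drift(t, x) + s * u) * dt + s * np.sqrt(dt) * noise
    return x, energy

# Toy usage (all choices below are assumptions made for illustration).
x0 = np.zeros((4096, 2))
drift = lambda t, x: np.zeros_like(x)        # zero base drift
control = lambda t, x: np.full_like(x, 1.0)  # constant control u = (1, 1)
sigma = lambda t: 0.5                        # constant volatility
x1, energy = simulate_controlled_sde(x0, drift, control, sigma)
```

In a training loop the constant control would be replaced by a learned function, and the energy estimate would be combined with whatever terminal term enforces the endpoint constraint.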

This formulation differs sharply from sample-to-sample Schrödinger bridge estimation. In the data-driven setting of Pavon–Tabak–Trigila, both endpoint marginals are represented by empirical measures, and training solves a sample-based Schrödinger system or its neural approximation (Pavon et al., 2018). The empirical-risk formulation of Belomestny–Naumov–Puchkin–Suchkov likewise assumes samples from $\rho_0$ and $\rho_T$ and estimates a transformed potential by minimizing an empirical fixed-point loss (Belomestny et al., 9 Feb 2026). Data-to-energy training removes one of these sample sets and replaces it with an energy oracle.

A broader formulation allows one or both marginals to be known only up to unnormalized densities. In that terminology, the present case is “data-to-energy,” where only $\mathcal E_1$ is available, while “energy-to-energy” denotes the case in which both marginals are specified by energies rather than samples (Tamogashev et al., 30 Sep 2025).

2. Variational structure and stochastic optimal control

A central observation is that the data-to-energy bridge can be written as a stochastic optimal control problem with terminal cost

E(x)E(x)0

The value function

E(x)E(x)1

satisfies the Hamilton–Jacobi–Bellman equation

E(x)E(x)2

and the optimal control is

E(x)E(x)3

If one defines E(x)E(x)4, then E(x)E(x)5 solves the forward Schrödinger integral equation, linking the control formulation directly to Schrödinger bridge potentials (Shin et al., 17 Feb 2026).
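For orientation, the standard stochastic-control identities for the controlled SDE of Section 1 can be written as follows. This is a textbook sketch that assumes a generic terminal cost $g$ and a scalar $\sigma_t$; the symbols $g$, $V_t$, and $\varphi_t$ are introduced here for illustration and need not match the notation or the specific terminal cost of the cited paper.

```latex
\begin{aligned}
V_t(x) &= \inf_{u}\ \mathbb E\Bigl[\tfrac12\int_t^1 \|u_s(X_s)\|^2\,ds + g(X_1)\,\Big|\,X_t = x\Bigr],\\
0 &= \partial_t V_t + f_t\cdot\nabla V_t + \tfrac12\sigma_t^2\,\Delta V_t
     - \tfrac12\sigma_t^2\,\|\nabla V_t\|^2,\qquad V_1 = g,\\
u_t^\star(x) &= -\sigma_t\,\nabla V_t(x),\\
\varphi_t := e^{-V_t}\ &\Longrightarrow\
\partial_t\varphi_t + f_t\cdot\nabla\varphi_t + \tfrac12\sigma_t^2\,\Delta\varphi_t = 0,
\qquad u_t^\star = \sigma_t\,\nabla\log\varphi_t .
\end{aligned}
```

The exponential change of variables removes the quadratic gradient term, so $\varphi_t$ satisfies a linear equation. By the Feynman-Kac formula, $\varphi_t(x)$ is then the conditional expectation of $\varphi_1(X_1)$ given $X_t = x$ under the uncontrolled base SDE, which is the integral-equation form referred to above.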

The same object may be described by forward and backward Schrödinger potentials E(x)E(x)6, with optimal controls

E(x)E(x)7

This forward/backward factorization is the basis of alternating half-bridge updates, bridge matching, and iterative proportional fitting constructions (Shin et al., 17 Feb 2026).
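To make the alternating structure concrete, the following sketch runs iterative proportional fitting on a small one-dimensional grid. It is a static, discretized analogue and not any of the cited neural solvers. The source marginal is an empirical histogram and the target is built from an energy evaluated on the grid; the double-well energy, the grid, the Gaussian reference kernel, and the function name `ipf_potentials` are all illustrative assumptions.

```python
import numpy as np

def ipf_potentials(kernel, mu, nu, n_iters=500):
    """Alternately rescale a and b so that diag(a) @ kernel @ diag(b)
    has row sums mu (source) and column sums nu (target)."""
    a = np.ones_like(mu)
    b = np.ones_like(nu)
    for _ in range(n_iters):
        b = nu / (kernel.T @ a)   # half-step: match the target marginal
        a = mu / (kernel @ b)     # half-step: match the source marginal
    return a, b

grid = np.linspace(-3.0, 3.0, 61)

# Source: histogram of toy "data" samples, standing in for an empirical dataset.
rng = np.random.default_rng(0)
samples = rng.normal(0.0, 0.5, size=5000)
edges = np.linspace(-3.05, 3.05, 62)          # 61 bins centered on the grid points
counts, _ = np.histogram(samples, bins=edges)
mu = (counts + 1e-6) / np.sum(counts + 1e-6)  # small floor keeps every entry positive

# Target: known only through an unnormalized energy; on a grid it can be
# normalized by summation, which is not possible in high dimension.
energy = (grid ** 2 - 1.0) ** 2
nu = np.exp(-energy)
nu /= nu.sum()

# Reference: Gaussian transition kernel with variance eps over unit time.
eps = 1.0
kernel = np.exp(-(grid[:, None] - grid[None, :]) ** 2 / (2.0 * eps))

a, b = ipf_potentials(kernel, mu, nu)
coupling = a[:, None] * kernel * b[None, :]
```

The two scaling vectors play the role of the discretized forward and backward potentials, and each line of the loop corresponds to one half-bridge update.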

In the Boltzmann-sampling setting, where the terminal law is E(x)E(x)8, the path-space KL also admits a static form

E(x)E(x)9

so the terminal correction appears explicitly as a density-ratio term enforcing dXt=ft(Xt)dt+σtdWt,X0pdata,dX_t = f_t(X_t)\,dt + \sigma_t\,dW_t,\qquad X_0\sim p_{\mathrm{data}},0 (Liu et al., 27 Jun 2025).

A related limiting statement appears in data-to-energy stochastic dynamics: when the bridge drift dXt=ft(Xt)dt+σtdWt,X0pdata,dX_t = f_t(X_t)\,dt + \sigma_t\,dW_t,\qquad X_0\sim p_{\mathrm{data}},1 is compared to a reference drift dXt=ft(Xt)dt+σtdWt,X0pdata,dX_t = f_t(X_t)\,dt + \sigma_t\,dW_t,\qquad X_0\sim p_{\mathrm{data}},2,

dXt=ft(Xt)dt+σtdWt,X0pdata,dX_t = f_t(X_t)\,dt + \sigma_t\,dW_t,\qquad X_0\sim p_{\mathrm{data}},3

and as dXt=ft(Xt)dt+σtdWt,X0pdata,dX_t = f_t(X_t)\,dt + \sigma_t\,dW_t,\qquad X_0\sim p_{\mathrm{data}},4 this recovers dynamic optimal transport with squared-Euclidean cost (Tamogashev et al., 30 Sep 2025).

3. Training objectives and algorithmic families

Recent methods differ primarily in how they avoid explicit target samples while still identifying the optimal bridge or a close approximation.

| Framework | Core training decomposition | Distinctive feature |
|---|---|---|
| ASBM | Stage 1: Adjoint Matching + Corrector Matching; Stage 2: backward bridge matching | data-to-energy forward learning, then generative reverse dynamics |
| LightSB-M | single bridge matching from a reciprocal process | arbitrary transport plan input; equivalence to LightSB/EgNOT objective |
| Data-to-Energy Stochastic Dynamics | generalized IPF with backward ML and forward VarGrad | off-policy RL formulation; learned diffusion coefficient |
| ASBS | alternating Adjoint Matching and Corrector Matching | arbitrary source distributions; no target-sample estimation during training |

In ASBM, the first stage learns the forward Schrödinger bridge control without ever constructing dXt=ft(Xt)dt+σtdWt,X0pdata,dX_t = f_t(X_t)\,dt + \sigma_t\,dW_t,\qquad X_0\sim p_{\mathrm{data}},5. The Adjoint Matching loss regresses the forward control against a terminal energy-gradient target plus a proxy terminal corrector,

dXt=ft(Xt)dt+σtdWt,X0pdata,dX_t = f_t(X_t)\,dt + \sigma_t\,dW_t,\qquad X_0\sim p_{\mathrm{data}},6

while Corrector Matching learns the terminal proxy by

dXt=ft(Xt)dt+σtdWt,X0pdata,dX_t = f_t(X_t)\,dt + \sigma_t\,dW_t,\qquad X_0\sim p_{\mathrm{data}},7

After the forward model has converged and induced an approximate optimal coupling dXt=ft(Xt)dt+σtdWt,X0pdata,dX_t = f_t(X_t)\,dt + \sigma_t\,dW_t,\qquad X_0\sim p_{\mathrm{data}},8, stage 2 learns the reverse-time drift by bridge matching,

dXt=ft(Xt)dt+σtdWt,X0pdata,dX_t = f_t(X_t)\,dt + \sigma_t\,dW_t,\qquad X_0\sim p_{\mathrm{data}},9

The paper emphasizes that the entire first stage uses only the forward simulation under ut(x)u_t(x)0 (Shin et al., 17 Feb 2026).

LightSB-M, introduced by Gushchin et al., starts from a reciprocal process ut(x)u_t(x)1 whose endpoint coupling is an arbitrary ut(x)u_t(x)2, and performs a single KL projection onto the manifold of Schrödinger bridges. The learned object is an adjusted Schrödinger potential ut(x)u_t(x)3, and the energy-based objective

ut(x)u_t(x)4

coincides, up to additive constants, with both ut(x)u_t(x)5 and the optimal bridge-matching objective. The practical LightSB-M solver uses a Gaussian-mixture parameterization

ut(x)u_t(x)6

which yields closed-form normalizers, drifts, and conditional samplers (Gushchin et al., 2024).
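The closed-form normalizer follows from the Gaussian moment-generating function. The sketch below checks this numerically in one dimension, assuming a potential of the form $v(x_1)=\sum_k \alpha_k\,\mathcal N(x_1; r_k, \varepsilon s_k)$ and a conditional plan proportional to $\exp(x_0 x_1/\varepsilon)\,v(x_1)$. This parameterization and the names `alpha`, `r`, `s`, and `eps` are assumptions made for illustration; the source's exact formula is not reproduced here.

```python
import numpy as np

def normalizer_closed_form(x0, alpha, r, s, eps):
    """c(x0) = int exp(x0 * x1 / eps) v(x1) dx1 for a 1D Gaussian-mixture potential v."""
    return np.sum(alpha * np.exp((s * x0 ** 2 + 2.0 * r * x0) / (2.0 * eps)))

def normalizer_quadrature(x0, alpha, r, s, eps, lo=-30.0, hi=30.0, n=200001):
    """The same integral by a plain Riemann sum, used only as a check."""
    x1 = np.linspace(lo, hi, n)
    dx = x1[1] - x1[0]
    var = eps * s[:, None]
    dens = np.exp(-(x1[None, :] - r[:, None]) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)
    v = alpha @ dens
    return np.sum(np.exp(x0 * x1 / eps) * v) * dx

# Toy mixture (values are arbitrary).
alpha = np.array([0.3, 0.7])
r = np.array([-1.0, 2.0])
s = np.array([0.5, 1.5])
eps = 1.0
x0 = 0.8

c_exact = normalizer_closed_form(x0, alpha, r, s, eps)
c_num = normalizer_quadrature(x0, alpha, r, s, eps)

# The conditional law of x1 given x0 is again a Gaussian mixture:
# weights proportional to each term of c(x0), means r + s * x0, variances eps * s.
weights = alpha * np.exp((s * x0 ** 2 + 2.0 * r * x0) / (2.0 * eps)) / c_exact
means = r + s * x0
```

Because the conditional law stays in the Gaussian-mixture family, sampling the endpoint reduces to picking a component and drawing from a Gaussian, with no numerical integration at inference time.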

Data-to-Energy Stochastic Dynamics generalizes path-space IPF to the case where the final marginal is known only through an energy. A backward half-bridge is fitted by maximum likelihood on reverse trajectories, while the forward half-bridge replaces unavailable target-sample likelihoods with a conditional variance objective,

ut(x)u_t(x)7

Training is further recast as a finite-horizon MDP with on-policy and off-policy trajectory mixtures, replay buffers, and Langevin refinement (Tamogashev et al., 30 Sep 2025).
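One appeal of a variance objective in this setting is that it is insensitive to the unknown normalizing constant of the target. The sketch below illustrates that property with a generic log-variance loss over a batch of trajectories. It is not the paper's conditional objective, and the array names `log_pf`, `log_pb`, and `neg_energy` are hypothetical placeholders for forward-path log-probabilities, backward-path log-probabilities, and terminal values of $-E(x)$.

```python
import numpy as np

def log_variance_loss(log_pf, log_pb, neg_energy):
    """Sample variance of the per-trajectory log-ratio between the forward
    path measure and the unnormalized backward path measure."""
    log_ratio = log_pf - (log_pb + neg_energy)
    return np.var(log_ratio, ddof=1)

# Random stand-ins for quantities that a real sampler would compute.
rng = np.random.default_rng(0)
log_pf = rng.normal(size=256)
log_pb = rng.normal(size=256)
neg_energy = rng.normal(size=256)

loss = log_variance_loss(log_pf, log_pb, neg_energy)

# Shifting the energy by a constant, i.e. changing the unknown
# log-normalizer, leaves the loss unchanged.
loss_shifted = log_variance_loss(log_pf, log_pb, neg_energy + 3.7)
```

The loss vanishes only when the log-ratio is constant across trajectories, which is the condition that the two path measures agree up to normalization.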

ASBS is an allied method for learning to sample from Boltzmann distributions when the source is a simple prior rather than an empirical dataset. It alternates Adjoint Matching and Corrector Matching and avoids both importance-weighted estimation and target-sample estimation during training. The authors prove convergence to the unique global solution under mild smoothness and expressivity assumptions. The framework generalizes the Adjoint Sampling method of Havens et al. by relaxing the memoryless condition, which allows arbitrary source distributions (Liu et al., 27 Jun 2025).

Implementation practice is correspondingly heterogeneous. In the ASBS setting, reported practical choices include unit time horizon, a geometric noise schedule with ut(x)u_t(x)8 and ut(x)u_t(x)9, 50–100 Euler steps, replay buffers of dXt=[ft(Xt)+σtut(Xt)]dt+σtdWt,dX_t = [f_t(X_t)+\sigma_t u_t(X_t)]\,dt + \sigma_t\,dW_t,0 sample pairs, dXt=[ft(Xt)+σtut(Xt)]dt+σtdWt,dX_t = [f_t(X_t)+\sigma_t u_t(X_t)]\,dt + \sigma_t\,dW_t,1 clipping to norm dXt=[ft(Xt)+σtut(Xt)]dt+σtdWt,dX_t = [f_t(X_t)+\sigma_t u_t(X_t)]\,dt + \sigma_t\,dW_t,2, and alternating Adjoint Matching and Corrector Matching for 5–20 stages with 50–200 gradient steps per stage (Liu et al., 27 Jun 2025).

4. Couplings, non-memorylessness, and trajectory geometry

A defining issue in data-to-energy Schrödinger bridge training is whether the reference construction is memoryless. ASBM argues that ordinary diffusion models inherit highly curved trajectories and noisy score targets from an uninformative memoryless forward process that induces independent data-noise coupling. In the memoryless case, dXt=[ft(Xt)+σtut(Xt)]dt+σtdWt,dX_t = [f_t(X_t)+\sigma_t u_t(X_t)]\,dt + \sigma_t\,dW_t,3, the SB-optimal coupling becomes independent, and the backward drift reduces to the classical score SDE

dXt=[ft(Xt)+σtut(Xt)]dt+σtdWt,dX_t = [f_t(X_t)+\sigma_t u_t(X_t)]\,dt + \sigma_t\,dW_t,4

which “forgets” dXt=[ft(Xt)+σtut(Xt)]dt+σtdWt,dX_t = [f_t(X_t)+\sigma_t u_t(X_t)]\,dt + \sigma_t\,dW_t,5 entirely and injects maximum noise (Shin et al., 17 Feb 2026).

The non-memoryless alternative preserves endpoint dependence through the backward potential dXt=[ft(Xt)+σtut(Xt)]dt+σtdWt,dX_t = [f_t(X_t)+\sigma_t u_t(X_t)]\,dt + \sigma_t\,dW_t,6. ASBM writes the optimal non-memoryless backward drift as

dXt=[ft(Xt)+σtut(Xt)]dt+σtdWt,dX_t = [f_t(X_t)+\sigma_t u_t(X_t)]\,dt + \sigma_t\,dW_t,7

and reports that the process “remembers” its endpoint and travels in a near-straight line in state-time. Empirically, this reduces the transport cost

dXt=[ft(Xt)+σtut(Xt)]dt+σtdWt,dX_t = [f_t(X_t)+\sigma_t u_t(X_t)]\,dt + \sigma_t\,dW_t,8

by a factor dXt=[ft(Xt)+σtut(Xt)]dt+σtdWt,dX_t = [f_t(X_t)+\sigma_t u_t(X_t)]\,dt + \sigma_t\,dW_t,9 on CIFAR10, reduces the path-straightness ratio X1pprior(x)exp(E(x)).X_1 \sim p_{\mathrm{prior}}(x)\propto \exp(-E(x)).0 by X1pprior(x)exp(E(x)).X_1 \sim p_{\mathrm{prior}}(x)\propto \exp(-E(x)).1 relative to score SDE, and halves trajectory variance when measured using 10 samples per X1pprior(x)exp(E(x)).X_1 \sim p_{\mathrm{prior}}(x)\propto \exp(-E(x)).2 (Shin et al., 17 Feb 2026).

LightSB-M supplies a complementary geometric statement. Its optimal projection theorem asserts that for any reciprocal process X1pprior(x)exp(E(x)).X_1 \sim p_{\mathrm{prior}}(x)\propto \exp(-E(x)).3 with coupling X1pprior(x)exp(E(x)).X_1 \sim p_{\mathrm{prior}}(x)\propto \exp(-E(x)).4,

X1pprior(x)exp(E(x)).X_1 \sim p_{\mathrm{prior}}(x)\propto \exp(-E(x)).5

so a single projection onto the Schrödinger-bridge manifold returns the true bridge. The associated tractable matching loss depends on the drift mismatch between the candidate SB drift and the Brownian-bridge target X1pprior(x)exp(E(x)).X_1 \sim p_{\mathrm{prior}}(x)\propto \exp(-E(x)).6, and the paper shows that this MSE-type loss differs from the LightSB/EgNOT objective only by an additive constant (Gushchin et al., 2024).
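The Brownian-bridge target mentioned above has a simple closed form when the reference is a Brownian motion with constant volatility. The sketch below samples an intermediate point from the bridge between paired endpoints and returns the conditional drift toward $x_1$ that a bridge-matching loss would regress against. The constant `eps` and the names `brownian_bridge_target` and `matching_loss` are illustrative assumptions, not the paper's code.

```python
import numpy as np

def brownian_bridge_target(x0, x1, t, eps, rng):
    """Sample X_t given (x0, x1) for a Brownian reference with variance eps
    per unit time, and return the conditional drift (x1 - X_t) / (1 - t)."""
    t = np.asarray(t, dtype=float)[:, None]
    mean = (1.0 - t) * x0 + t * x1
    std = np.sqrt(eps * t * (1.0 - t))
    xt = mean + std * rng.standard_normal(x0.shape)
    target = (x1 - xt) / (1.0 - t)
    return xt, target

def matching_loss(pred_drift, target):
    """Mean squared error between a candidate drift and the bridge target."""
    return np.mean(np.sum((pred_drift - target) ** 2, axis=1))

# Toy endpoint pairs; any coupling of the two marginals could be used here.
rng = np.random.default_rng(0)
x0 = rng.normal(size=(1000, 2))
x1 = rng.normal(loc=3.0, size=(1000, 2))
t = rng.uniform(0.0, 0.95, size=1000)   # stay away from t = 1, where the target diverges
xt, target = brownian_bridge_target(x0, x1, t, eps=0.1, rng=rng)
```

In the zero-noise case `eps = 0` the target reduces to the straight-line velocity `x1 - x0`, which is a convenient sanity check for an implementation.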

These results are often read as correcting a common misconception: matching-based Schrödinger bridge training is not restricted to iterative procedures that must accumulate transport-plan error. Under the assumptions stated in LightSB-M, arbitrary transport plans can be used as inputs to a single optimal bridge-matching step (Gushchin et al., 2024).

5. Empirical behavior across domains

On image generation, ASBM reports substantial gains in low-NFE regimes. On CIFAR-10 with 100 NFE and a VP schedule, the reported FID values are Score SDE X1pprior(x)exp(E(x)).X_1 \sim p_{\mathrm{prior}}(x)\propto \exp(-E(x)).7, SB-FBSDE X1pprior(x)exp(E(x)).X_1 \sim p_{\mathrm{prior}}(x)\propto \exp(-E(x)).8, VSDM X1pprior(x)exp(E(x)).X_1 \sim p_{\mathrm{prior}}(x)\propto \exp(-E(x)).9, and ASBM Epu[1201ut(Xt)2dt]\mathbb E_{p^u}\Bigl[\frac12\int_0^1 \|u_t(X_t)\|^2\,dt\Bigr]0. At 25 NFE, the comparison is Score SDE Epu[1201ut(Xt)2dt]\mathbb E_{p^u}\Bigl[\frac12\int_0^1 \|u_t(X_t)\|^2\,dt\Bigr]1 versus ASBM Epu[1201ut(Xt)2dt]\mathbb E_{p^u}\Bigl[\frac12\int_0^1 \|u_t(X_t)\|^2\,dt\Bigr]2. The FID-vs-NFE curve is reported to plateau at approximately Epu[1201ut(Xt)2dt]\mathbb E_{p^u}\Bigl[\frac12\int_0^1 \|u_t(X_t)\|^2\,dt\Bigr]3 for ASBM by 1000 NFE, compared with approximately Epu[1201ut(Xt)2dt]\mathbb E_{p^u}\Bigl[\frac12\int_0^1 \|u_t(X_t)\|^2\,dt\Bigr]4 for score SDE. On FFHQ-latent, the reported values are Epu[1201ut(Xt)2dt]\mathbb E_{p^u}\Bigl[\frac12\int_0^1 \|u_t(X_t)\|^2\,dt\Bigr]5 at 50 NFE, Epu[1201ut(Xt)2dt]\mathbb E_{p^u}\Bigl[\frac12\int_0^1 \|u_t(X_t)\|^2\,dt\Bigr]6 at 100 NFE, and Epu[1201ut(Xt)2dt]\mathbb E_{p^u}\Bigl[\frac12\int_0^1 \|u_t(X_t)\|^2\,dt\Bigr]7 at 500 NFE. For one-step distillation on CIFAR-10, the paper reports SDS Epu[1201ut(Xt)2dt]\mathbb E_{p^u}\Bigl[\frac12\int_0^1 \|u_t(X_t)\|^2\,dt\Bigr]8 with recall Epu[1201ut(Xt)2dt]\mathbb E_{p^u}\Bigl[\frac12\int_0^1 \|u_t(X_t)\|^2\,dt\Bigr]9 and precision ρ0\rho_00, DMD ρ0\rho_01 with ρ0\rho_02, and ASBM ρ0\rho_03 with ρ0\rho_04. The same study states that forward NFE ρ0\rho_05 suffices and that even 10 forward steps still outperform score SDE with 100 steps (Shin et al., 17 Feb 2026).

On classical Schrödinger-bridge benchmarks, LightSB-M reports cB ρ0\rho_06-UVP errors of approximately ρ0\rho_07 at ρ0\rho_08, compared with LightSB ρ0\rho_09, DSBM ρT\rho_T0, and SFρT\rho_T1M ρT\rho_T2. At ρT\rho_T3, the reported values are LightSB-M ρT\rho_T4, LightSB ρT\rho_T5, DSBM ρT\rho_T6, and SFρT\rho_T7M ρT\rho_T8. On single-cell trajectory inference, reported energy distances include ρT\rho_T9 at dimension E1\mathcal E_10 versus DSBM E1\mathcal E_11, SFE1\mathcal E_12M E1\mathcal E_13, and LightSB E1\mathcal E_14, and E1\mathcal E_15 at dimension E1\mathcal E_16 versus DSBM E1\mathcal E_17, SFE1\mathcal E_18M E1\mathcal E_19, and LightSB E(x)E(x)00. The reported CPU training times are approximately E(x)E(x)01 seconds for LightSB-M versus approximately 6 minutes on GPU for DSBM (Gushchin et al., 2024).

Data-to-Energy Stochastic Dynamics evaluates the explicitly sample-free terminal setting on “Gauss E(x)E(x)02 GMM” and “Two Moons E(x)E(x)03 GMM” tasks in E(x)E(x)04, reporting E(x)E(x)05 and path-KL on par with data-to-data SB even though the GMM endpoint is accessed only through its energy. The same study reports that learning the diffusion coefficient improves several IPF-based baselines by E(x)E(x)06 in path-KL and Wasserstein cost when using E(x)E(x)07 or fewer discretization steps. In latent posterior sampling, the method produces semantic content-preserving image-to-image translations, and the reported FID between SB-transported samples and real-class images is often better than that obtained by simple rejection sampling of the latent space (Tamogashev et al., 30 Sep 2025).

In the allied source-to-energy regime, ASBS reports that it halves or better the previous best E(x)E(x)08-Wasserstein errors on MW-5, DW-4, LJ-13, and LJ-55; achieves the lowest KL on each of five alanine torsion marginals and the lowest E(x)E(x)09 error on the 2D Ramachandran plot; and attains 70–75% coverage without relaxation and approximately 90% with relaxation on amortized conformer generation, compared with approximately 57% for AS (Liu et al., 27 Jun 2025).

6. Relation to adjacent Schrödinger-bridge literature

The data-to-energy literature sits alongside, rather than replaces, sample-based Schrödinger bridge estimation. Pavon–Tabak–Trigila formulated a data-driven bridge from empirical marginals using Fortet–Sinkhorn-style iterations, constrained maximum likelihood estimation, and importance sampling, specifically to avoid grid discretization in high dimension (Pavon et al., 2018). Belomestny–Naumov–Puchkin–Suchkov later rewrote the Schrödinger system in terms of a single transformed potential satisfying a nonlinear fixed-point equation, learned by empirical risk minimization; under sub-Gaussian, Lipschitz, boundedness, and function-class assumptions, the paper establishes uniform concentration of empirical risk around population risk and near-parametric rates E(x)E(x)10 up to logarithmic factors when the bracketing entropy scales like E(x)E(x)11 (Belomestny et al., 9 Feb 2026).

Generalized Schrödinger Bridge Matching extends the bridge objective beyond kinetic energy to task-specific state costs and soft KL penalties. In its data-to-energy recipe, one sets E(x)E(x)12, introduces the objective

E(x)E(x)13

uses conditional stochastic optimal control with Gaussian path parameterizations, and in practice precomputes target samples from the unnormalized energy by short-run Langevin MCMC before explicit flow matching. This suggests a methodological division between terminal-energy training that remains sample-free on the target side and training that converts the energy into an auxiliary sample pool (Liu et al., 2023).
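The auxiliary-sample route can be illustrated with an unadjusted Langevin sampler that turns an energy gradient into approximate target samples. This is a minimal sketch on a toy quadratic energy; the step size, chain length, and the name `short_run_langevin` are assumptions for illustration, and short unadjusted chains are biased in general.

```python
import numpy as np

def short_run_langevin(grad_energy, x_init, step=0.05, n_steps=400, seed=0):
    """Unadjusted Langevin iteration x <- x - step * grad E(x) + sqrt(2 * step) * noise."""
    rng = np.random.default_rng(seed)
    x = np.array(x_init, dtype=float)
    for _ in range(n_steps):
        x = x - step * grad_energy(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    return x

# Toy energy E(x) = ||x||^2 / 2, whose Boltzmann law is a standard Gaussian.
grad_energy = lambda x: x
x_init = np.full((20000, 2), 4.0)   # deliberately poor initialization
samples = short_run_langevin(grad_energy, x_init)
```

The resulting pool would then serve as a stand-in target dataset, which is exactly what the sample-free methods discussed earlier avoid.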

Several recurrent misconceptions are explicitly challenged in this line of work. One is that Schrödinger bridge training requires samples from both marginals; the data-to-energy and data-free IPF formulations were introduced precisely for the case in which the target marginal is available only through an unnormalized density (Tamogashev et al., 30 Sep 2025). A second is that bridge learning with an energy-defined endpoint is essentially the same as standard score-based diffusion; ASBM instead attributes curvature and noisy score targets to the memoryless forward process and replaces them with non-memoryless bridges (Shin et al., 17 Feb 2026). A third is that matching-based solvers must rely on importance weighting or repeated plan-refinement cycles; ASBS replaces these with simple matching objectives and on-policy samples, while LightSB-M proves exact recovery from a single optimal bridge-matching step under its assumptions (Liu et al., 27 Jun 2025, Gushchin et al., 2024).

Taken together, these works define data-to-energy Schrödinger bridge training as a technically specific regime: endpoint information is asymmetric, the terminal marginal is represented by an energy rather than a dataset, and training is organized around stochastic optimal control, Schrödinger potentials, bridge matching, or IPF-like alternation rather than direct score supervision from target samples. The resulting methods occupy a junction between diffusion modeling, entropic optimal transport, and energy-based learning, with the main design choices centered on endpoint access, coupling construction, memorylessness, and whether the target energy is used directly or first converted into samples.
