
Sinkhorn Bridge: Entropic OT & Stochastic Flows

Updated 28 October 2025
  • Sinkhorn Bridge is a framework that interpolates between probability distributions using entropic optimal transport and the Schrödinger bridge formulation.
  • It employs iterative Sinkhorn scaling and log-sum-exp techniques to enforce marginal constraints efficiently, even in high-dimensional or sample-based settings.
  • The approach guarantees robust convergence and has broad applications in generative modeling, optimal control, and large-scale data science.

A Sinkhorn Bridge is a computational and conceptual framework that connects entropic optimal transport (EOT), the Schrödinger bridge problem, and scalable inference or learning of stochastic flows between probability distributions. It refers to both the mathematical object that interpolates between endpoint distributions under an entropic regularization, and to a family of iterative and plug-in algorithms (Sinkhorn-type) that construct such interpolations efficiently in high-dimensional or sample-based settings.

1. Mathematical Foundations of the Sinkhorn Bridge

The Sinkhorn Bridge formalism originates from the equivalence between the Schrödinger bridge problem—finding the most likely stochastic process (in relative entropy) connecting given endpoint distributions—and entropy-regularized optimal transport (EOT). Consider probability measures $\mu, \nu$ on a measurable space $\mathcal{X}$, a reference process (e.g., Brownian motion with noise parameter $\varepsilon$), and cost $C(x, y)$:

$$\mathrm{OT}_\varepsilon(\mu, \nu) = \min_{\pi \in \Pi(\mu, \nu)} \int_{\mathcal{X}^2} C(x, y)\, d\pi(x, y) + \varepsilon\, \mathrm{KL}(\pi \,\|\, \mu \otimes \nu)$$

The entropic regularization renders the problem strictly convex and smooth. The Sinkhorn divergence is the "de-biased" version:

$$\mathcal{S}_\varepsilon(\mu, \nu) = \mathrm{OT}_\varepsilon(\mu, \nu) - \frac{1}{2}\mathrm{OT}_\varepsilon(\mu, \mu) - \frac{1}{2}\mathrm{OT}_\varepsilon(\nu, \nu)$$

which metrizes weak convergence, ensures positivity and convexity, and generalizes both optimal transport and MMD as $\varepsilon \to 0$ and $\varepsilon \to \infty$, respectively (Feydy et al., 2018).

A prototypical example is the dynamic Schrödinger bridge, which seeks a law $\mathbf{P}$ on path space minimizing $\mathrm{KL}(\mathbf{P} \,\|\, \mathbf{R})$ (with $\mathbf{R}$ the law of the reference SDE), under prescribed endpoint marginals $\mathbf{P}_0 = \mu$, $\mathbf{P}_1 = \nu$. Its solution can be written using scaling functions (Schrödinger factors) obeying a nonlinear integral system structurally identical to that of Sinkhorn scaling in the discrete case (Chen et al., 2020, Essid et al., 2018, Marino et al., 2019).
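One standard way to make this correspondence explicit (conventions differ slightly across the cited works): writing the density of the optimal endpoint coupling with respect to $\mu \otimes \nu$ as $f(x)\, K_\varepsilon(x, y)\, g(y)$, with $K_\varepsilon$ the reference transition kernel, the marginal constraints force the Schrödinger factors $f, g$ to satisfy

$$f(x) \int_{\mathcal{X}} K_\varepsilon(x, y)\, g(y)\, d\nu(y) = 1, \qquad g(y) \int_{\mathcal{X}} K_\varepsilon(x, y)\, f(x)\, d\mu(x) = 1,$$

and alternately solving for $f$ and $g$ is exactly the continuous analogue of the discrete Sinkhorn updates described next.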

2. Sinkhorn Iterations: Algorithmic Structure and Generalized Schemes

The computational embodiment of the Sinkhorn Bridge is iterative proportional fitting (the Sinkhorn algorithm), which alternately enforces the required marginals on an entropic kernel coupling. For discrete distributions with weight vectors $p$ and $q$, given cost matrix $C_{ij}$ and kernel $K_{ij} = \exp(-C_{ij}/\varepsilon)$, the scaling vectors $u, v$ are updated as

$$u^{(\ell+1)} = \frac{p}{K v^{(\ell)}}, \qquad v^{(\ell+1)} = \frac{q}{K^\top u^{(\ell+1)}}$$

and the transport plan converges to $P^* = \operatorname{diag}(u^*)\, K\, \operatorname{diag}(v^*)$ (Feydy et al., 2018).
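A minimal NumPy sketch of these updates for empirical measures (illustrative only: the function names are ours, and stabilization and stopping criteria are omitted):

```python
import numpy as np

def sinkhorn_plan(p, q, C, eps, n_iters=200):
    """Minimal Sinkhorn scaling sketch: alternately rescale the Gibbs kernel
    K = exp(-C / eps) so that the coupling has row marginals p and column
    marginals q."""
    K = np.exp(-C / eps)                   # entropic (Gibbs) kernel
    u, v = np.ones_like(p), np.ones_like(q)
    for _ in range(n_iters):
        u = p / (K @ v)                    # enforce row marginals
        v = q / (K.T @ u)                  # enforce column marginals
    return u[:, None] * K * v[None, :]     # P* = diag(u) K diag(v)

# Toy usage: uniform empirical measures on two random point clouds.
rng = np.random.default_rng(0)
X, Y = rng.normal(size=(5, 2)), rng.normal(size=(6, 2)) + 2.0
C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)   # squared-Euclidean cost
P = sinkhorn_plan(np.full(5, 1/5), np.full(6, 1/6), C, eps=0.5)
print(P.sum(axis=1), P.sum(axis=0))        # approximately p and q
```

For small $\varepsilon$ the kernel entries underflow numerically, which is why practical implementations work in the log domain, as discussed below.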

In the continuous and path-space setting (Schrödinger bridge), the Sinkhorn (Fortet) algorithm consists of recursive updates of scaling functions that enforce marginal constraints on the joint coupling generated by the reference kernel (typically, the heat kernel or more general Markov transitions) (Essid et al., 2018).

Efficiency is enabled by log-sum-exp reductions, GPU-accelerated tensor routines, and on-the-fly kernel aggregation (e.g., KeOps), which avoid explicit materialization of large matrices, rendering the method linear in memory and applicable to millions of samples (Feydy et al., 2018).
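The same iteration in the log domain, together with the de-biased divergence $\mathcal{S}_\varepsilon$ from Section 1, can be sketched as follows (again our own illustrative NumPy/SciPy code, not the KeOps/GeomLoss implementation; under the $\mathrm{KL}(\pi\,\|\,\mu\otimes\nu)$ normalization used above, the converged dual value $\langle f, p\rangle + \langle g, q\rangle$ equals $\mathrm{OT}_\varepsilon$):

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn_dual_log(p, q, C, eps, n_iters=500):
    """Log-domain Sinkhorn sketch: update dual potentials (f, g) with
    log-sum-exp reductions, which stays stable for small eps where the
    kernel exp(-C/eps) would underflow."""
    log_p, log_q = np.log(p), np.log(q)
    f, g = np.zeros_like(p), np.zeros_like(q)
    for _ in range(n_iters):
        f = -eps * logsumexp((g[None, :] - C) / eps + log_q[None, :], axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps + log_p[:, None], axis=0)
    # At (approximate) convergence, OT_eps = <f, p> + <g, q> in this normalization.
    return f @ p + g @ q, f, g

def sinkhorn_divergence(X, Y, eps):
    """De-biased divergence S_eps = OT_eps(mu, nu) - (OT_eps(mu, mu) + OT_eps(nu, nu)) / 2
    for uniform empirical measures on point clouds X and Y."""
    p = np.full(len(X), 1.0 / len(X))
    q = np.full(len(Y), 1.0 / len(Y))
    cost = lambda A, B: ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    ot_xy, _, _ = sinkhorn_dual_log(p, q, cost(X, Y), eps)
    ot_xx, _, _ = sinkhorn_dual_log(p, p, cost(X, X), eps)
    ot_yy, _, _ = sinkhorn_dual_log(q, q, cost(Y, Y), eps)
    return ot_xy - 0.5 * (ot_xx + ot_yy)
```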

Data-Driven and Stochastic Generalizations

Sample-based Sinkhorn Bridges are constructed iteratively using only samples (point clouds) from the marginals, replacing integrals with empirical averages and using constrained maximum likelihood and importance sampling (Pavon et al., 2018). This methodology underpins the entire class of data-driven SB solvers, including neural Sinkhorn bridges for learned drifts and diffusions (Nodozi et al., 2023).

The plug-in estimator variant (Sinkhorn bridge estimator) proceeds by solving the static entropic OT problem (via Sinkhorn scaling), extracting the dual potentials, and using an explicit formula for the time-dependent interpolating drift

$$\hat{b}_t(z) = \frac{1}{1-t} \left(-z + \frac{\sum_j Y_j \exp\!\big(\hat{g}_j - \tfrac{1}{2(1-t)\varepsilon}\|z - Y_j\|^2\big)}{\sum_j \exp\!\big(\hat{g}_j - \tfrac{1}{2(1-t)\varepsilon}\|z - Y_j\|^2\big)}\right)$$

thus sidestepping all iterative SDE fitting and neural parameterization (Pooladian et al., 21 Aug 2024).
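In code, the drift is a softmax-weighted average of the target samples; a minimal NumPy sketch (assuming the squared-Euclidean/Brownian-reference convention above, with $\hat{g}_j$ the target-side dual potentials from the static Sinkhorn solve—how they are rescaled by $\varepsilon$ depends on the convention):

```python
import numpy as np
from scipy.special import logsumexp

def bridge_drift(z, t, Y, g_hat, eps):
    """Plug-in Sinkhorn-bridge drift b_t(z) for 0 <= t < 1: a softmax average
    of target samples Y_j, weighted by the potentials g_hat_j and a Gaussian
    kernel whose bandwidth shrinks as t -> 1, minus the current state z."""
    sq = ((z[None, :] - Y) ** 2).sum(-1)             # ||z - Y_j||^2 for all j
    logits = g_hat - sq / (2.0 * (1.0 - t) * eps)    # exponents from the display above
    w = np.exp(logits - logsumexp(logits))           # numerically stable softmax
    return (-z + w @ Y) / (1.0 - t)
```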

3. Theoretical Guarantees: Positivity, Convexity, and Convergence

Sinkhorn divergences inherit a robust theoretical foundation:

  • Positivity and convexity: $\mathcal{S}_\varepsilon(\alpha, \beta) \geq 0$, with equality iff $\alpha = \beta$.
  • Metrization: $\mathcal{S}_\varepsilon(\alpha_n, \alpha) \to 0$ iff $\alpha_n \rightharpoonup \alpha$.
  • Strict convexity and uniqueness follow from the geometric entropy (Sinkhorn negentropy) structure, providing a unique minimizer (Feydy et al., 2018).

The convergence of Sinkhorn iterates (and even their gradients and Hessians) is exponential in Hilbert's projective metric, persists in high dimensions, and is controlled by explicit contraction coefficients that depend on system-theoretic (controllability Gramian) and geometric parameters of support sets (Teter et al., 2023, Greco et al., 2023, Akyildiz et al., 12 Mar 2025).

The Sinkhorn Bridge estimator achieves finite-sample error bounded by $O(1/m + 1/n + r^{4k})$ in total variation, where $m, n$ are the sample sizes, $k$ is the number of Sinkhorn steps, and $r < 1$, so the iteration error decays exponentially in $k$; the dimension dependence enters through the intrinsic rather than the ambient dimension (Maeda et al., 26 Oct 2025, Pooladian et al., 21 Aug 2024).

For log-concave target measures and general reference distributions, exponential contraction and stability are rigorously established via Riccati difference equations and transportation-cost inequalities, ensuring robust convergence for the broad class of models used in applied machine learning (Moral, 20 Mar 2025).

4. Algorithmic Implementation and Computational Strategies

Implementing the Sinkhorn Bridge involves several critical components:

  • Discrete/sampled measure representation: Supports arbitrary distributions as Dirac mixtures; enables empirical computation.
  • Alternating minimization: Sinkhorn updates via block coordinate ascent, with log-sum-exp for speed and stability.
  • Gradient computation: Uses fixed-point differentiation, avoiding explicit backpropagation through all iterations; a 2–3× runtime acceleration is typical (Feydy et al., 2018); see the sketch after this list.
  • Large-scale GPU acceleration: The KeOps library provides symbolic kernel reductions, linearizing memory cost even for $N, M \sim 10^6$ (Feydy et al., 2018).
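The fixed-point differentiation idea can be illustrated as follows: run the Sinkhorn loop without building an autograd graph, then re-evaluate the dual objective once with the converged potentials held fixed, so gradients with respect to the input samples require only a single differentiable kernel evaluation. This is our own minimal PyTorch sketch of that pattern, not the GeomLoss/KeOps implementation:

```python
import math
import torch

def sinkhorn_loss_detached(x, y, eps=0.1, n_iters=100):
    """Sketch of fixed-point (detached-iteration) differentiation: the Sinkhorn
    loop runs under no_grad, then one final differentiable pass re-evaluates
    the dual objective with the converged potentials treated as constants."""
    n, m = x.shape[0], y.shape[0]
    log_p = torch.full((n,), -math.log(n))
    log_q = torch.full((m,), -math.log(m))
    cost = lambda a, b: ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)

    with torch.no_grad():                              # iterations build no graph
        C = cost(x, y)
        f, g = torch.zeros(n), torch.zeros(m)
        for _ in range(n_iters):
            f = -eps * torch.logsumexp((g[None, :] - C) / eps + log_q[None, :], dim=1)
            g = -eps * torch.logsumexp((f[:, None] - C) / eps + log_p[:, None], dim=0)

    # Single differentiable pass: the cost matrix (hence x, y) re-enters the
    # graph, while the converged potentials f, g act as fixed constants.
    C = cost(x, y)
    f_new = -eps * torch.logsumexp((g[None, :] - C) / eps + log_q[None, :], dim=1)
    g_new = -eps * torch.logsumexp((f[:, None] - C) / eps + log_p[:, None], dim=0)
    return (f_new * log_p.exp()).sum() + (g_new * log_q.exp()).sum()

# Usage: gradient of the entropic cost with respect to the first point cloud.
x = torch.randn(64, 2, requires_grad=True)
y = torch.randn(64, 2) + 1.0
sinkhorn_loss_detached(x, y).backward()
print(x.grad.shape)
```

The per-iteration log-sum-exp here is the dense version; the KeOps route computes the same reductions symbolically without materializing the $N \times M$ matrix.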

For pathwise interpolants or SDE-driven applications, the dual potentials computed from a single Sinkhorn call translate via a plug-in expression into time-dependent drifts for SDE simulation, with sampling at arbitrary times and query points possible at essentially no additional cost (Pooladian et al., 21 Aug 2024, Maeda et al., 26 Oct 2025).
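As a sketch of that simulation step (reusing the hypothetical `bridge_drift` helper from Section 2 and assuming a Brownian reference, so the diffusion coefficient is $\sqrt{\varepsilon}$):

```python
import numpy as np
# bridge_drift(z, t, Y, g_hat, eps): the plug-in drift sketched in Section 2.

def simulate_bridge(x0, Y, g_hat, eps, n_steps=200, rng=None):
    """Euler-Maruyama sketch of dZ_t = b_t(Z_t) dt + sqrt(eps) dW_t on [0, 1),
    started from a sample x0 ~ mu. Integration stops one step short of t = 1
    because the plug-in drift carries a 1/(1 - t) factor."""
    rng = np.random.default_rng() if rng is None else rng
    dt = 1.0 / n_steps
    z = np.asarray(x0, dtype=float).copy()
    for step in range(n_steps - 1):
        t = step * dt
        z = z + bridge_drift(z, t, Y, g_hat, eps) * dt \
              + np.sqrt(eps * dt) * rng.normal(size=z.shape)
    return z   # approximate sample from the target marginal nu
```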

5. Applications Across Machine Learning, Optimal Control, and Data Science

The Sinkhorn Bridge framework supports a broad spectrum of applications:

  • Generative modeling: Sinkhorn divergences are deployed as loss functions in high-dimensional generative neural architectures (GANs, diffusion models), supporting robust learning on data concentrated on manifolds or with singular support (Genevay et al., 2017).
  • Entropy-regularized optimal transport: Provides tractable and scalable computation in imaging, shape analysis, and text modeling.
  • Schrödinger bridge samplers: Iterative Sinkhorn-based policy refinement for sampling from complex posteriors, estimation of normalizing constants, and non-equilibrium physics; sequential bridge sampling improves variance and scalability (Bernton et al., 2019).
  • Stochastic optimal control: Sinkhorn Bridges characterize minimum entropy (energy) solutions to steering problems for diffusions and controlled Markov processes, even with complex endpoint constraints or termination (absorbing boundaries) (Eldesoukey et al., 11 Apr 2024).
  • Unification of existing methods: Modern Schrödinger bridge solvers and generative samplers (e.g., [SF]$^2$M, DSBM-IMF, LightSB(-M)) are special cases of—or theoretically justified by—the Sinkhorn Bridge error bounds and statistical convergence theory (Maeda et al., 26 Oct 2025).

6. Limitations and Active Research Directions

While the classical Sinkhorn Bridge has well-understood contraction properties and robust convergence under entropic regularization, extension to control-affine SDEs without channel-matching conditions leads to coupled, nonlinear PDEs for which the classic Sinkhorn recursion does not apply; developing generalized algorithms in these regimes remains open (Teter et al., 22 Mar 2025).

Further directions include optimal preconditioning to accelerate convergence, extension to infinite-marginal and dynamic constraints, and integration with neural stochastic flow architectures when endpoint densities are intractable or accessed only via simulation (Nodozi et al., 2023).


Table: Summary of Sinkhorn Bridge Key Features

| Aspect | Sinkhorn Bridge Feature | References |
|---|---|---|
| Divergence Type | Entropic-regularized OT (interpolates OT and MMD) | (Feydy et al., 2018, Genevay et al., 2017) |
| Algorithmic Core | Alternating scaling/Sinkhorn iteration (block coordinate ascent, log-sum-exp) | (Feydy et al., 2018, Essid et al., 2018) |
| Statistical Guarantee | $O(1/m + 1/n + r^{4k})$ error in empirical bridge | (Maeda et al., 26 Oct 2025, Pooladian et al., 21 Aug 2024) |
| Scalability | Linear memory/compute via KeOps, batched GPU execution | (Feydy et al., 2018) |
| Application Domains | ML (generative, data-driven control), OT, stochastic control | (Chen et al., 2020, Bernton et al., 2019) |

In summary, the Sinkhorn Bridge constitutes a rigorous, unified framework for regularized stochastic interpolation, combining computational tractability, statistical guarantees, and versatility across machine learning, control, and data science.
