Forward-Reverse Monte Carlo for SBP

Updated 2 July 2025
  • The forward-reverse iterative Monte Carlo procedure is a kernel-based, sample-driven algorithm that estimates the positive Schrödinger boundary potentials needed to solve the Schrödinger Bridge Problem.
  • It alternates forward and reverse simulations with nonparametric kernel regression to update the potentials iteratively while preserving positivity and contractivity, which yields geometric convergence.
  • The method achieves minimax-optimal convergence rates and remains applicable in high-dimensional settings, making it valuable for stochastic control, data interpolation, and machine learning.

The forward-reverse iterative Monte Carlo procedure for the Schrödinger Bridge Problem (SBP) is a kernel-based, sample-driven algorithm designed to solve general entropic optimal transport problems between prescribed initial and final distributions. This method addresses the construction of Schrödinger potentials—strictly positive boundary functions needed to specify the SBP marginals—by iteratively updating and estimating them through simulation and nonparametric regression. The procedure is contractive in the Hilbert projective metric, preserves positivity, and achieves minimax-optimal convergence rates for the potentials, facilitating efficient estimation of distributions associated with the SBP in both theory and practice.

1. Foundations: The Schrödinger Bridge Problem in Entropic Optimal Transport

The SBP seeks the most likely evolution—minimizing the relative entropy with respect to a reference Markov process—between two endpoint (marginal) densities, $\rho_0(x)$ and $\rho_T(z)$, on $\mathbb{R}^d$ over a fixed time interval $[0, T]$, given a reference process with transition density $q(0, x; T, z)$. This problem is central in entropic optimal transport and is a continuous analog of Sinkhorn's algorithm: the joint initial/terminal law must have marginals matching $\rho_0$ and $\rho_T$ but otherwise remain as close as possible (in KL divergence) to the law of the reference process.

The solution is characterized by two positive potential functions (the so-called Schrödinger potentials or boundary potentials) $\nu_0(x)$ and $\nu_T(z)$, with the joint law

$$\mu(dx, dz) = q(0, x; T, z)\, \nu_0(x)\, \nu_T(z)\, dx\, dz$$

satisfying the marginal constraints
$$\nu_T(z) \int q(0, x; T, z)\, \nu_0(x)\, dx = \rho_T(z), \qquad \nu_0(x) \int q(0, x; T, z)\, \nu_T(z)\, dz = \rho_0(x).$$
The existence and uniqueness (up to scaling) of these potentials are crucial for the well-posedness of the SBP.

2. Picard Iteration as a Fixed-Point Problem

Recovering the Schrödinger potentials involves solving a nonlinear integral system, typically via Picard iteration. Let $g^\star$ be a positive function on the support of $\rho_T$. Define the nonlinear operator $\mathcal{C}$ by

$$\mathcal{C}[g](z) = \int_{\mathsf{S}_0} \frac{\rho_0(x)}{\int_{\mathsf{S}_T} q(0, x; T, z')\, \frac{\rho_T(z')}{g(z')}\, dz'}\; q(0, x; T, z)\, dx,$$

and seek a fixed point $g^\star = \mathcal{C}[g^\star]$. Once $g^\star$ is found, the potentials are recovered as $\nu_T = \rho_T / g^\star$ and $\nu_0(x) = \rho_0(x) \big/ \int_{\mathsf{S}_T} q(0, x; T, z)\, \nu_T(z)\, dz$.

This framework, an extension of Sinkhorn's scaling, generalizes to arbitrary reference processes provided forward and reverse simulation is feasible.
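
As a concrete illustration, here is a minimal sketch of the Picard iteration $g \leftarrow \mathcal{C}[g]$ on a discretized state space. The one-dimensional Brownian reference, the grids, and all names (`picard_fixed_point`, etc.) are illustrative assumptions, not code from the paper; Riemann sums stand in for the integrals.

```python
import numpy as np

def picard_fixed_point(x_grid, z_grid, rho0, rhoT, q, n_iter=500, tol=1e-10):
    """Iterate g <- C[g] on a grid; q[i, j] approximates q(0, x_i; T, z_j)."""
    dx = x_grid[1] - x_grid[0]
    dz = z_grid[1] - z_grid[0]
    g = np.ones_like(z_grid)                 # any positive initial guess works
    for _ in range(n_iter):
        inner = q @ (rhoT / g) * dz          # inner(x) = integral of q(0,x;T,z') rho_T(z')/g(z') dz'
        g_new = (rho0 / inner) @ q * dx      # C[g](z) = integral of rho_0(x)/inner(x) q(0,x;T,z) dx
        if np.max(np.abs(np.log(g_new / g))) < tol:   # stop when essentially fixed
            g = g_new
            break
        g = g_new
    nu_T = rhoT / g                          # nu_T = rho_T / g*
    nu_0 = rho0 / (q @ nu_T * dz)            # nu_0(x) = rho_0(x) / integral of q(0,x;T,z) nu_T(z) dz
    return nu_0, nu_T

# Usage: Brownian reference over [0, T] with Gaussian endpoint marginals.
T = 1.0
x = np.linspace(-6.0, 6.0, 400)
z = np.linspace(-6.0, 6.0, 400)
q = np.exp(-(z[None, :] - x[:, None]) ** 2 / (2 * T)) / np.sqrt(2 * np.pi * T)
rho0 = np.exp(-(x + 1.0) ** 2 / 2) / np.sqrt(2 * np.pi)
rhoT = np.exp(-(z - 1.0) ** 2 / 2) / np.sqrt(2 * np.pi)
nu_0, nu_T = picard_fixed_point(x, z, rho0, rhoT, q)
```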

3. Forward-Reverse Iterative Monte Carlo and Kernel Regression

The core of the algorithm is a forward-reverse Monte Carlo implementation of the Picard iteration. At each step:

  • Forward update: For a given $x$, estimate $\mathbb{E}[\nu_T^{(n)}(X_T^x)]$ from $N$ forward simulations of the reference process via kernel regression (sketched in code below):

$$\widehat{g}(x) = \frac{\sum_{i=1}^N K\!\left(\frac{x - X_0^{(i)}}{\delta}\right) \nu_T^{(n)}(X_T^{(i)})}{\sum_{i=1}^N K\!\left(\frac{x - X_0^{(i)}}{\delta}\right)},$$

where $\{X_0^{(i)}, X_T^{(i)}\}$ are i.i.d. reference samples, $K$ is a kernel function, and $\delta > 0$ is a bandwidth.

  • Reverse update: For $z$, simulate the time-reversed process and estimate $\mathbb{E}[\nu_0^{(n+1)}(Y_T^z)\, \mathcal{Y}_T^z]$ nonparametrically:

$$\widehat{h}(z) = \frac{\sum_{j=1}^M K\!\left(\frac{z - Y_0^{(j)}}{\delta}\right) \nu_0^{(n+1)}(Y_T^{(j)})\, \mathcal{Y}_T^{(j)}}{\sum_{j=1}^M K\!\left(\frac{z - Y_0^{(j)}}{\delta}\right) \mathcal{Y}_T^{(j)}},$$

where $Y^z$ is the reverse process and $\mathcal{Y}^z$ is a likelihood weight.

  • Iterate: Update the potentials using the forward and reverse regression estimates, continuing until the potentials converge in an appropriate sense.

Kernel-based regression is chosen for its nonparametric flexibility, which is essential when the reference process or target distributions are complex and only sample access is available.
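
The forward update above is a Nadaraya-Watson kernel regression. The following minimal sketch shows it for a scalar Brownian reference with a Gaussian kernel; the sample generator and the placeholder potential are illustrative assumptions, not code from the paper.

```python
import numpy as np

def forward_update(x_eval, X0, XT, nu_T, delta):
    """Kernel-regression estimate of g(x) = E[nu_T(X_T) | X_0 = x].

    X0, XT : paired start/end points of i.i.d. reference-process samples.
    delta  : kernel bandwidth.
    """
    K = lambda u: np.exp(-0.5 * u ** 2)               # Gaussian kernel (constants cancel)
    w = K((x_eval[:, None] - X0[None, :]) / delta)    # kernel weights, shape (n_eval, N)
    return (w @ nu_T(XT)) / w.sum(axis=1)             # Nadaraya-Watson ratio

# Usage with a Brownian-motion reference (placeholder current potential):
rng = np.random.default_rng(0)
N, T = 10_000, 1.0
X0 = rng.normal(size=N)                      # sampled start points
XT = X0 + np.sqrt(T) * rng.normal(size=N)    # reference-process endpoints
nu_T = lambda y: np.exp(-(y - 1.0) ** 2)     # placeholder potential, for illustration
g_hat = forward_update(np.linspace(-2, 2, 50), X0, XT, nu_T, delta=0.2)
```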

4. Positivity, Contractivity, and Theoretical Guarantees

The proposed scheme preserves positivity at each iteration and is contractive with respect to the Hilbert projective metric
$$d_H(f, g) = \log \frac{\sup f/g}{\inf f/g}.$$
By Birkhoff's theorem, the (truncated) kernel regression operator inherits the strict contractivity of the underlying exact fixed-point map. Thus, the iteration converges geometrically to a unique positive fixed point (modulo scaling), even in the presence of Monte Carlo noise.
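
A minimal sketch of this metric, evaluated on function values sampled at common points (an assumption; the theory takes the sup and inf over the whole domain). The final assertion checks the projective invariance that makes "unique up to scaling" the natural notion of uniqueness here.

```python
import numpy as np

def hilbert_metric(f_vals, g_vals):
    """d_H(f, g) = log( sup(f/g) / inf(f/g) ) for strictly positive samples."""
    r = f_vals / g_vals
    return np.log(r.max() / r.min())

f = np.exp(np.linspace(0.0, 1.0, 100))
g = np.ones_like(f)
assert np.isclose(hilbert_metric(f, g), 1.0)                         # sup/inf ratio is e
assert np.isclose(hilbert_metric(3.0 * f, g), hilbert_metric(f, g))  # invariant under scaling
```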

For the kernel-based MC estimator, minimax-optimal convergence rates are proven: for sample size $N$ and underlying function smoothness $\alpha$ in dimension $d$,

$$\mathbb{E}\left[ d_H(\widehat{g}_k, g^\star) \right] \lesssim N^{-\frac{1+\alpha}{2(1+\alpha) + d}}$$

after $k \gtrsim \log N$ iterations, which is theoretically sharp for nonparametric regression under these conditions.

5. Forward-Reverse Simulation and Non-nested Monte Carlo Estimation

Once the potentials are learned, the SBP Markov process can be simulated by adjusting the drift of the reference process,

$$dX_t = \left( b(X_t, t) + \sigma(X_t, t)\, \sigma(X_t, t)^\top \nabla \log h(X_t, t) \right) dt + \sigma(X_t, t)\, dW_t,$$

with $h(w, t) = \int q(t, w; T, y)\, \nu_T(y)\, dy$. However, approximation error in the potentials induces errors in the law of the simulated trajectories, particularly near the boundaries of the time interval.
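
To make the drift adjustment concrete, here is a minimal Euler-Maruyama sketch for a Brownian reference ($b = 0$, $\sigma = 1$), where $h(w, t) = \mathbb{E}[\nu_T(w + \sqrt{T - t}\, Z)]$ is estimated by an inner Monte Carlo and the score by central finite differences. Everything here is an illustrative assumption; note that it exhibits exactly the nested, endpoint-sensitive evaluation that the non-nested procedure below avoids.

```python
import numpy as np

rng = np.random.default_rng(1)

def score_log_h(x, t, nu_T, T, n_inner=2000, fd=1e-3):
    """Finite-difference estimate of d/dx log h(x, t), with common random numbers."""
    z = rng.normal(size=n_inner)
    h_plus = np.mean(nu_T(x + fd + np.sqrt(T - t) * z))    # inner MC for h(x + fd, t)
    h_minus = np.mean(nu_T(x - fd + np.sqrt(T - t) * z))   # inner MC for h(x - fd, t)
    return (np.log(h_plus) - np.log(h_minus)) / (2 * fd)

def simulate_bridge(x0, nu_T, T=1.0, n_steps=100):
    """Euler-Maruyama for dX = grad log h(X, t) dt + dW (Brownian reference)."""
    dt = T / n_steps
    x = x0
    for k in range(n_steps - 1):      # stop one step early: h degenerates as t -> T
        t = k * dt
        x = x + score_log_h(x, t, nu_T, T) * dt + np.sqrt(dt) * rng.normal()
    return x

nu_T = lambda y: np.exp(-(y - 1.0) ** 2)     # placeholder learned potential
endpoints = np.array([simulate_bridge(0.0, nu_T) for _ in range(200)])
```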

To avoid nested evaluation and numerical instability in high dimensions or near the time endpoints, a non-nested Monte Carlo procedure for finite-dimensional SBP statistics is introduced. The procedure involves:

  • Sampling endpoint pairs i.i.d. from the learned $\nu_0$ and $\nu_T$,
  • Simulating forward and backward trajectories from these endpoints,
  • Estimating joint expectations via a forward-reverse matching scheme using the associated kernel weights and likelihood corrections.

This approach efficiently estimates path functionals and marginal distributions of the SBP process without explicitly solving the SDE with numerically unstable learned drifts at every time step.
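
The full forward-reverse matching estimator is beyond a short sketch, but the reweighting idea behind non-nested estimation can be shown for endpoint statistics. The following is a simplification, not the paper's scheme: a self-normalized importance-sampling estimator of $\mathbb{E}[\varphi(X_0, X_T)]$ under the SBP coupling $\mu(dx, dz) = q(0, x; T, z)\, \nu_0(x)\, \nu_T(z)\, dx\, dz$, using only plain reference-process samples and the learned potentials.

```python
import numpy as np

rng = np.random.default_rng(2)

def sbp_endpoint_expectation(phi, nu_0, nu_T, T=1.0, n=100_000):
    """Estimate E[phi(X_0, X_T)] under mu = q nu_0 nu_T dx dz (Brownian reference).

    Proposal: X0 ~ N(0, 1), then XT | X0 from the reference transition. The
    transition density cancels, so target/proposal = nu_0(X0) nu_T(XT) / p0(X0).
    """
    X0 = rng.normal(size=n)                           # proposal start points
    p0 = np.exp(-X0 ** 2 / 2) / np.sqrt(2 * np.pi)    # proposal density at X0
    XT = X0 + np.sqrt(T) * rng.normal(size=n)         # reference transitions
    w = nu_0(X0) * nu_T(XT) / p0                      # importance weights
    return np.sum(w * phi(X0, XT)) / np.sum(w)        # self-normalized estimate

# Usage with placeholder learned potentials (illustrative only):
nu_0 = lambda x: np.exp(-(x + 1.0) ** 2)
nu_T = lambda y: np.exp(-(y - 1.0) ** 2)
mean_XT = sbp_endpoint_expectation(lambda x, y: y, nu_0, nu_T)
```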

6. Impact and Applications

The forward-reverse iterative Monte Carlo method for SBP, as developed in the paper, expands the reach of entropic optimal transport to general reference processes and high-dimensional marginal distributions, without requiring explicit transition densities. It further yields theoretical assurances: minimax optimality for the potential estimates and contractivity that drives robust convergence. Applications range from stochastic control and data interpolation to machine learning problems that require high-dimensional transport or likelihood interpolation, and the method provides a flexible, scalable framework for nonparametric entropic transport estimation using only simulation access to the dynamics.

7. Key Formulas and Algorithmic Summary

| Concept | Key Formula/Description |
| --- | --- |
| Schrödinger potentials | $\rho_0(x) = \nu_0(x) \int q(0, x; T, z)\, \nu_T(z)\, dz$; $\ \rho_T(z) = \nu_T(z) \int q(0, x; T, z)\, \nu_0(x)\, dx$ |
| Picard iteration | $g^\star = \mathcal{C}[g^\star]$ (fixed point); see the explicit integral operator above |
| Kernel regression step | See the formulas for $\widehat{g}(x)$ and $\widehat{h}(z)$ above |
| Non-nested estimator | Uses independent forward and reverse process samples from $\nu_0$ and $\nu_T$ for path-functional MC |
| Convergence rate | $d_H(\widehat{g}_k, g^\star) \lesssim N^{-\frac{1+\alpha}{2(1+\alpha) + d}}$ after $\mathcal{O}(\log N)$ iterations |

This systematic forward-reverse approach, combining kernel regression and iterative Monte Carlo, addresses both theoretical optimality and practical computation for general SBP and entropic optimal transport settings.