
Image-to-Image Schrödinger Bridge (I²SB)

Updated 2 February 2026
  • Image-to-Image Schrödinger Bridge (I²SB) is a generative modeling framework that interpolates between image distributions using entropy-regularized optimal transport.
  • The approach extends classical diffusion methods by replacing fixed Gaussian priors with arbitrary marginals and jointly optimizing drift and volatility via the Schrödinger–Bass Bridge (SBB).
  • LightSBB-M leverages closed-form updates and efficient training loops to achieve state-of-the-art performance in synthetic benchmarks and real-world FFHQ image translations.

Image-to-Image Schrödinger Bridge (I²SB) is a class of generative modeling frameworks that learns stochastic dynamics interpolating between two arbitrary data distributions, typically two types of images. I²SB extends classical score-based diffusion by replacing the fixed Gaussian prior with arbitrary marginals, enabling direct domain translation, restoration, and enhancement tasks within a tractable entropy-regularized optimal transport setting. Recent work has further generalized the Schrödinger Bridge by introducing the Schrödinger–Bass Bridge (SBB), which jointly optimizes both drift and volatility components, providing an interpolation between pure drift-driven and pure martingale stochastic transport (Alouadi et al., 27 Jan 2026).

1. Mathematical Foundations of SBB and I²SB

Let $\mu_0,\mu_T$ be distributions over $\mathbb{R}^d$ encoding source and target (e.g., adult and child faces in FFHQ latent space). The SBB seeks a path measure $\mathcal{P}$ over trajectories $X_t$ driven by both drift $\alpha_t$ and volatility $\sigma_t$:

$$dX_t = \alpha_t(X_t)\,dt + \sigma_t(X_t)\,dW_t, \quad X_0 \sim \mu_0,\ X_T \sim \mu_T$$

The joint cost combines drift and volatility penalties:

$$J(\mathcal{P}) = \mathbb{E}_{\mathcal{P}}\left[ \int_0^T \left( \|\alpha_t\|^2 + \beta \|\sigma_t - \sqrt{\epsilon}I\|^2_F \right) dt \right]$$

where $\beta>0$ tunes the drift-volatility trade-off. In the dual, with Lagrange multiplier and PDE arguments, optimizing this cost reduces to solving:

$$\sup_{v,\psi} \left\{ \int \psi(x)\, \mu_T(dx) - \int v(0,x)\,\mu_0(dx) \right\}$$

subject to $\partial_t v + H^*_\beta(\nabla v, D^2v) = 0,\ v(T,\cdot) = \psi$, with the Fenchel–Legendre transform $H^*_\beta(p,q)=\frac{1}{2}|p|^2 + \frac{1}{2}\epsilon\beta\,\operatorname{Tr}[(I - q/\beta)^{-1} - I]$ for $q < D^2v < \beta I$ (Alouadi et al., 27 Jan 2026).
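To make the controlled SDE and the cost $J$ concrete, the following is a minimal NumPy sketch, not the authors' code. It rolls out $dX_t = \alpha_t\,dt + \sigma_t\,dW_t$ with an Euler–Maruyama scheme and accumulates a Monte Carlo estimate of the running cost. The function name `simulate_sbb_cost`, the callables `alpha` and `sigma`, and the step count are illustrative assumptions; the sketch does not enforce the terminal constraint $X_T \sim \mu_T$.

```python
import numpy as np

def simulate_sbb_cost(alpha, sigma, x0, T=1.0, n_steps=100, beta=1.0, eps=1.0, seed=0):
    """Euler-Maruyama rollout of dX = alpha dt + sigma dW, plus a Monte Carlo estimate of J.

    alpha(t, x) -> (n, d) drift, sigma(t, x) -> (n, d, d) volatility (both hypothetical callables).
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)              # particles, shape (n, d)
    n, d = x.shape
    dt = T / n_steps
    eye = np.eye(d)
    cost = np.zeros(n)
    for k in range(n_steps):
        t = k * dt
        a = alpha(t, x)
        s = sigma(t, x)
        # running cost: |alpha|^2 + beta * ||sigma - sqrt(eps) I||_F^2
        drift_pen = np.sum(a ** 2, axis=1)
        vol_pen = np.sum((s - np.sqrt(eps) * eye) ** 2, axis=(1, 2))
        cost += (drift_pen + beta * vol_pen) * dt
        # Brownian increment and Euler-Maruyama update
        dw = rng.normal(scale=np.sqrt(dt), size=(n, d))
        x = x + a * dt + np.einsum("nij,nj->ni", s, dw)
    return x, cost.mean()

# Reference dynamics: zero drift and sigma = sqrt(eps) * I incur no cost.
x0 = np.zeros((64, 2))
zero_drift = lambda t, x: np.zeros_like(x)
ref_vol = lambda t, x: np.broadcast_to(np.sqrt(0.5) * np.eye(2), (x.shape[0], 2, 2))
x_T, J = simulate_sbb_cost(zero_drift, ref_vol, x0, eps=0.5)
```

In this reading, the cost measures how far the controlled process departs from a reference Brownian motion with volatility $\sqrt{\epsilon}I$: any nonzero drift or any deviation of $\sigma_t$ from $\sqrt{\epsilon}I$ is penalized.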

2. Closed-form Drift and Volatility, Role of $\beta$

The dual maximizer $v^*(t,x)$ yields analytic feedback laws:

  • Drift $\mu_t^*(x)=\nabla_x v^*(t,x)$.
  • Volatility $\sigma_t^*(x)=\sqrt{\epsilon}\,(I - (1/\beta)D^2_x v^*(t,x))^{-1}$.

Key interpolation regimes:

  • As $\beta\to\infty$: volatility freezes to $\sigma_t\approx\sqrt{\epsilon}I$ and the process reduces to the classical Schrödinger bridge (drift-only control).
  • As $\beta\to 0$: drift cost diverges, $\mu_t\to 0$, yielding a pure Bass martingale transport with volatility-only control.
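The volatility law and its large-$\beta$ limit can be checked numerically. The sketch below evaluates $\sigma^* = \sqrt{\epsilon}\,(I - D^2 v^*/\beta)^{-1}$ for a fixed symmetric matrix standing in for the Hessian of $v^*$ at one point. The function `optimal_volatility` and the matrix `H` are hypothetical and chosen only for illustration.

```python
import numpy as np

def optimal_volatility(hessian, beta, eps):
    """sigma* = sqrt(eps) * (I - H / beta)^{-1}; requires H < beta * I."""
    d = hessian.shape[0]
    m = np.eye(d) - hessian / beta
    # The formula is only meaningful when I - H / beta is positive definite.
    if np.any(np.linalg.eigvalsh(m) <= 0):
        raise ValueError("need D^2 v < beta * I")
    return np.sqrt(eps) * np.linalg.inv(m)

H = np.diag([0.5, -1.0])  # hypothetical Hessian of v* at a single (t, x)
for beta in (1.0, 10.0, 1e6):
    print(beta, np.round(optimal_volatility(H, beta, eps=1.0), 4))
```

For large `beta` the printed matrices approach $\sqrt{\epsilon}I$, matching the Schrödinger bridge regime. For small `beta` the admissibility condition $D^2 v^* < \beta I$ becomes restrictive and the volatility departs strongly from the reference value.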

Equivalently, the transport can be recast using a Schrödinger potential $h_T^*$ and a stretching map $\Phi_t(y)=|y|^2/2 + (\epsilon/\beta)\log h_t(y)$ whose Hessian impacts local volatility.

3. LightSBB-M Algorithmic Workflow

The LightSBB-M algorithm computes the SBB transport plan by operating in the nonlinear map space $Y \leftrightarrow X$. The workflow alternates between:

  • Learning the score-drift $s_\theta(t,y) \approx \epsilon \nabla_y \log h_t(y)$ using bridge-matching loss.
  • Learning the transport map $Z_{\tilde{\theta}}(t,x) \approx Y$ via regression.

A typical training loop (Algorithm 1; Alouadi et al., 27 Jan 2026):

  1. Initialize map $Y_t^0(x)=x$.
  2. For $k=0\dots K-1$ outer iterations:
    • Sample endpoints $x_0\sim\mu_0,\ x_T\sim\mu_T$. Compute $y_0=Z_{\tilde{\theta}^k}(0,x_0),\ y_T=Z_{\tilde{\theta}^k}(T,x_T)$.
    • Sample intermediate $y_t$ from Brownian bridge between $y_0$ and $y_T$.
    • Update drift network $\theta$ via minimizing $\mathbb{E}[\|s_\theta(t,y_t)-(y_T-y_t)/(T-t)\|^2]$.
    • Update map $Z_{\tilde{\theta}}$ by regressing $X\mapsto Y$ and matching endpoints at $t=0,T$.

With closed-form solutions for $s_\theta$ and the regression step, the alternation converges in $\approx 5$ outer iterations.
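The bridge-sampling and drift-update steps of the loop can be sketched as follows. This is a simplified NumPy illustration, not Algorithm 1 itself. It assumes a Brownian bridge with variance scale $\epsilon$, draws $t$ uniformly away from $t=T$ to avoid division by zero, and treats `s_theta` as any callable. The map regression step is omitted, and all function names are hypothetical.

```python
import numpy as np

def sample_brownian_bridge(y0, yT, t, T=1.0, eps=1.0, rng=None):
    """Sample y_t from a Brownian bridge (variance scale eps) pinned at y0 and yT."""
    rng = np.random.default_rng() if rng is None else rng
    t = np.asarray(t, dtype=float)[:, None]          # (n, 1) for broadcasting
    mean = y0 + (t / T) * (yT - y0)                  # linear interpolation of endpoints
    std = np.sqrt(eps * t * (T - t) / T)             # bridge standard deviation
    return mean + std * rng.standard_normal(y0.shape)

def bridge_matching_loss(s_theta, y0, yT, T=1.0, eps=1.0, rng=None):
    """Monte Carlo estimate of E || s_theta(t, y_t) - (y_T - y_t) / (T - t) ||^2."""
    rng = np.random.default_rng() if rng is None else rng
    n = y0.shape[0]
    t = rng.uniform(0.0, 0.99 * T, size=n)           # assumption: stay away from t = T
    yt = sample_brownian_bridge(y0, yT, t, T, eps, rng)
    target = (yT - yt) / (T - t)[:, None]            # regression target from the loss
    pred = s_theta(t, yt)
    return np.mean(np.sum((pred - target) ** 2, axis=1))

# Toy usage with Gaussian endpoints and a zero predictor (illustrative only).
rng = np.random.default_rng(0)
y0 = rng.normal(size=(32, 3))
yT = rng.normal(size=(32, 3)) + 2.0
loss = bridge_matching_loss(lambda t, y: np.zeros_like(y), y0, yT, rng=rng)
```

In a full implementation `s_theta` would be a trainable network and the loss would be minimized by gradient descent before the map $Z_{\tilde{\theta}}$ is refit.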

4. Network Architecture and Training: FFHQ Example

For unpaired adult$\to$child translation in FFHQ:

  • Domain distributions $\mu_0,\mu_T$ reside in 512-dim ALAE latent space.
  • Drift and map networks are parameterized by deep neural nets (typically, residual architectures).
  • Training proceeds by alternating endpoint and bridge sampling, loss minimization, and network updates.
  • Hyperparameters are set to optimize 2-Wasserstein distance and empirical generative fidelity.

The algorithm is validated on synthetic benchmarks as well as real FFHQ images. It demonstrates substantial improvements (up to 32%) in 2-Wasserstein distance over classical SB and diffusion-based translation baselines.
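For readers unfamiliar with the metric, the sketch below shows how an empirical 2-Wasserstein distance and a relative improvement could be computed. It is restricted to one-dimensional samples of equal size, where the optimal coupling is obtained by sorting. The evaluation protocol of the paper is not specified here, so this is an assumption-laden illustration and both function names are hypothetical.

```python
import numpy as np

def w2_empirical_1d(a, b):
    """2-Wasserstein distance between two equal-size 1D samples (sorted coupling)."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("equal sample sizes expected")
    return np.sqrt(np.mean((a - b) ** 2))

def relative_improvement(w2_baseline, w2_model):
    """Fractional reduction of the distance relative to a baseline."""
    return (w2_baseline - w2_model) / w2_baseline

# A pure shift by 1 gives a distance of exactly 1.
d = w2_empirical_1d([0.0, 1.0, 2.0], [3.0, 1.0, 2.0])
```

A lower value means the generated samples lie closer to the target distribution; a figure such as "up to 32%" corresponds to `relative_improvement` returning 0.32.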

5. Inference Pipeline and Quantitative Evaluation

Single-pass inference:

  • Given a sample $x_0$ from $\mu_0$, the learned transport map and drift are applied to realize the optimal SBB trajectory toward $\mu_T$.
  • For image translation, this is decoded through the latent space to obtain final outputs.

Quantitative and qualitative results:

  • LightSBB-M achieves state-of-the-art results in both synthetic mixture transport and real-world image translation (e.g., adult$\to$child faces), outperforming prior SB and diffusion approaches.
  • Image outputs exhibit preserved identity components, age-relevant structure, and improved local detail due to joint drift and volatility control.

6. Theoretical and Practical Implications

The SBB formulation generalizes the classical Schrödinger Bridge by enabling direct manipulation of both drift (mean dynamics) and stochastic volatility. The analytic dual permits sample-efficient estimation, fast outer convergence, and interpretable control of generative stochasticity. LightSBB-M operationalizes SBB for high-dimensional, latent-space image translation.

Significance includes:

  • Scalable training, closed-form drift/volatility updates.
  • Control over generative diversity via the $\beta$ parameter: tuning between deterministic and stochastic transport.
  • Empirical superiority on both synthetic and real image-to-image tasks, validated by benchmark metrics.
  • Compatibility with unpaired and paired data, latent representations, and extensibility to other generative transport settings.

The LightSBB-M framework establishes a computationally efficient, theoretically grounded mechanism for scalable unpaired image-to-image translation via joint drift-volatility optimal transport, bridging practical generative modeling and advanced stochastic control (Alouadi et al., 27 Jan 2026).

References (1)
