
Conditional Föllmer Flow

Updated 22 November 2025
  • Conditional Föllmer flow is a framework that maps a base distribution to a prescribed conditional distribution using continuous-time ODE and SDE formulations.
  • The approach integrates variational principles, entropic optimal transport, and Schrödinger bridge problems, ensuring unbiased conditional sampling and strong convergence guarantees.
  • Empirical results demonstrate its efficacy in high-dimensional settings like MNIST synthesis and dynamical system forecasting, achieving state-of-the-art performance.

A conditional Föllmer flow is a family of probability measures, or stochastic processes, that provide an explicit, finite-time, continuous-time mapping from a simple base distribution—often a Dirac or standard normal—onto a prescribed conditional distribution. Unlike classical transport or diffusion approaches that target unconditional distributions, the conditional Föllmer flow framework is explicitly designed to model, sample, or learn conditional distributions, with particular impact in Bayesian inference, conditional generative modeling, and mean-field stochastic analysis. Multiple formulations exist—involving stochastic differential equations (SDE), ordinary differential equations (ODE), and flows on measure spaces—all unified by their connection to Föllmer’s original concept of path-space entropy minimization and the Schrödinger bridge problem.

1. Mathematical Formulation and Key Structures

The conditional Föllmer flow arises in several rigorous settings. In the core instance, let $(X_1, Y)$ be a pair of random variables on $\mathbb{R}^d \times \mathbb{R}^{d_Y}$ with joint density $\pi_1(x, y)$, and denote by $\pi_1(\cdot \mid y)$ the conditional density of interest. Introduce a base distribution $X_0 \sim \mathcal{N}(0, I_d)$, independent of $Y$, and define a mixture variable

$$X_t = t X_1 + \sqrt{1-t^2}\, X_0, \qquad t \in [0,1].$$

Denote by $\pi_t(x \mid y) = \text{Law}(X_t \mid Y = y)$ the conditional law of $X_t$ and by $s(x, y, t) = \nabla_x \log \pi_t(x \mid y)$ the conditional score. The conditional Föllmer flow is the ODE for $Z(t, y)$:

$$\frac{d Z(t, y)}{dt} = v_F(Z(t, y), y, t), \qquad Z(0, y) \sim \mathcal{N}(0, I_d),$$

with the Föllmer velocity field

$$v_F(x, y, t) = \frac{x + s(x, y, t)}{t}, \qquad t > 0,$$

and $v_F(x, y, 0) = \mathbb{E}[X_1 \mid Y = y]$ (Chang et al., 2 Feb 2024). This flow pushes forward the base measure to the target conditional, i.e., $F_1(Z_0, y) \sim \pi_1(\cdot \mid y)$, where $F_t$ denotes the time-$t$ flow map of the ODE and $Z_0 = Z(0, y)$.
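As a sanity check on the velocity formula, consider a hypothetical scalar toy case in which the target conditional is Gaussian, $X_1 \mid Y = y \sim \mathcal{N}(m, \sigma^2)$. The names `m`, `sigma`, and the helper functions below are illustrative assumptions, not taken from the cited papers. In this case $\pi_t(\cdot \mid y) = \mathcal{N}(tm,\, 1 + t^2(\sigma^2 - 1))$, the score is available in closed form, and $(x + s)/t$ simplifies to an expression that is well defined at $t = 0$. A minimal sketch that integrates the ODE with forward Euler:

```python
import math


def gaussian_follmer_velocity(x, t, m, sigma):
    """Closed-form v_F for the toy target X_1 | Y=y ~ N(m, sigma^2).

    With var_t = 1 + t^2 (sigma^2 - 1), the score is -(x - t m) / var_t and
    (x + score) / t simplifies to the expression below, which equals m at t = 0.
    """
    c = sigma ** 2 - 1.0
    return (m + t * c * x) / (1.0 + t * t * c)


def euler_flow(z0, m, sigma, num_steps=200):
    """Integrate dz/dt = v_F(z, t) from t = 0 to t = 1 with forward Euler."""
    h = 1.0 / num_steps
    z = z0
    for k in range(num_steps):
        z += h * gaussian_follmer_velocity(z, k * h, m, sigma)
    return z


if __name__ == "__main__":
    m, sigma = 2.0, 0.5
    for z0 in (-1.0, 0.0, 1.0):
        # The exact flow map for this toy case is z0 -> m + sigma * z0.
        print(z0, euler_flow(z0, m, sigma), m + sigma * z0)
```

In this toy case the exact flow map is $z_0 \mapsto m + \sigma z_0$, so the Euler output should approach that value as `num_steps` grows.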

A parallel construction in the SDE setting considers the problem of driving a system from an initial deterministic (or prior) measure to a conditional target in finite time,

$$d\theta_t = u^*(t, \theta_t)\,dt + \sqrt{\gamma}\, dW_t, \qquad \theta_0 = 0,$$

where the optimal drift $u^*$, the Föllmer drift, solves a path-space entropy minimization problem (the conditional Schrödinger bridge) (Vargas et al., 2021).

2. Theoretical Foundations: Variational Characterization and Entropic Optimal Transport

The conditional Föllmer flow is closely connected to the conditional Schrödinger bridge problem. Given a reference stochastic process with path law $W^\gamma$ (typically uncontrolled Brownian motion, i.e. Wiener measure), the objective is to construct a path-space law $P^*$ whose initial and terminal marginals match the base and the desired conditional distribution, respectively, while minimizing the Kullback–Leibler divergence to the reference:

$$P^* = \operatorname{argmin}_{P \,:\, P_0 = \delta_{x_0},\; P_1 = \pi_1(\cdot \mid y)} \mathrm{KL}(P \,\|\, W^\gamma)$$

(Vargas et al., 2021, Chen et al., 20 Mar 2024). The existence and uniqueness of such a projection are guaranteed under mild tail and regularity conditions. The drift of the resulting SDE or ODE admits a closed-form, variational, or regression-based characterization, and links to stochastic control and Hamilton–Jacobi–Bellman equations.

A representative continuous-time control objective is

$$J(u) = \mathbb{E}_{P^{u}}\left[\frac{1}{2\gamma}\int_{0}^{1}\|u(t, \theta_{t})\|^{2}\,dt - \ln\frac{p(D \mid \theta_{1})\, p(\theta_{1})}{(2\pi\gamma)^{-d/2}\exp(-\|\theta_{1}\|^{2}/2\gamma)}\right],$$

where $u$ ranges over Markov controls and $P^u$ is the induced path law; $J$ attains its minimum when $u$ is the Föllmer drift (Vargas et al., 2021). Analogous variational principles arise for conditional SDE interpolants with time-dependent coefficients (Chen et al., 20 Mar 2024).
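The quadratic term in $J(u)$ can be read as a relative entropy. A short worked identity follows, stated under two assumptions: $u$ is regular enough for Girsanov's theorem to apply (for example, a Novikov-type integrability condition), and $W^\gamma$ denotes the law of $\sqrt{\gamma}\,W_t$ started from the same point as the controlled process.

```latex
% P^u      : law of d\theta_t = u(t,\theta_t)\,dt + \sqrt{\gamma}\,dW_t, \theta_0 = 0
% W^\gamma : law of the uncontrolled reference \sqrt{\gamma}\,W_t (assumed notation)
\mathrm{KL}\left(P^{u} \,\|\, W^{\gamma}\right)
  = \mathbb{E}_{P^{u}}\left[\frac{1}{2\gamma}\int_{0}^{1}\|u(t,\theta_t)\|^{2}\,dt\right].
```

Under this reading, the first term of $J(u)$ penalizes deviation from the reference process, while the log-ratio term rewards terminal states that are likely under the target.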

3. Neural Parameterization, Training Objectives, and Implementation

All practical methods parameterize the drift (or velocity) field $v_F$ (ODE) or $u^*$ (SDE) via neural networks. For the ODE model (Chang et al., 2 Feb 2024), the velocity network $v_\theta(x, y, t)$ is taken from a hypothesis class of ReLU feed-forward networks with controlled depth, width, norm, and Lipschitz constants; the network takes $t$ and $y$ as additional input features.

The learning objective is the population quadratic loss

$$\mathcal{L}(v) = \frac{1}{T} \int_0^T \mathbb{E}_{X_0,(X_1,Y)}\left[\left\| X_1 - \frac{t}{\sqrt{1-t^2}}\, X_0 - v(X_t, Y, t) \right\|^2\right] dt,$$

which is minimized by $v_F$. Here $T \in (0, 1)$ truncates the time horizon, since the coefficient $t/\sqrt{1-t^2}$ diverges as $t \to 1$. Empirical risk minimization is performed on data drawn from $(X_1, Y)$ and $(t, X_0)$, using stochastic gradient descent over the network parameters.
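To make the regression objective concrete, here is a minimal Monte Carlo sketch on a hypothetical scalar toy model, $Y \sim \mathcal{N}(0,1)$ and $X_1 \mid Y = y \sim \mathcal{N}(y, \sigma^2)$, for which the minimizer is known in closed form. The model, the function names, and the choice $T = 0.9$ are illustrative assumptions; in practice a learned network $v_\theta$ would take the place of the closed-form velocity.

```python
import math
import random


def true_velocity(x, y, t, sigma):
    """Closed-form v_F when X_1 | Y=y ~ N(y, sigma^2) (toy assumption)."""
    c = sigma ** 2 - 1.0
    return (y + t * c * x) / (1.0 + t * t * c)


def mc_loss(velocity, sigma=0.5, T=0.9, n=20000, seed=0):
    """Monte Carlo estimate of L(v), with t drawn uniformly from [0, T]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        x1 = y + sigma * rng.gauss(0.0, 1.0)       # sample from pi_1(. | y)
        x0 = rng.gauss(0.0, 1.0)                   # base sample
        t = rng.uniform(0.0, T)
        xt = t * x1 + math.sqrt(1.0 - t * t) * x0  # interpolant X_t
        target = x1 - t / math.sqrt(1.0 - t * t) * x0
        total += (target - velocity(xt, y, t, sigma)) ** 2
    return total / n


if __name__ == "__main__":
    exact = mc_loss(true_velocity)
    shifted = mc_loss(lambda x, y, t, s: true_velocity(x, y, t, s) + 0.5)
    print(exact, shifted)  # the shifted field should have the larger loss
```

Because the minimizer is a conditional expectation, perturbing it by a constant $c$ should raise the population loss by about $c^2$; the two estimates above reuse the same seed so that the comparison is not dominated by sampling noise.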

For SDE-based flows (Vargas et al., 2021, Chen et al., 20 Mar 2024), the loss combines path-space quadratic control cost and (negative) log-likelihood matching, and is typically discretized via the Euler–Maruyama scheme. A Monte Carlo estimate with minibatched data is used for computational tractability. Notably, discretization, estimation, and time-truncation errors are all rigorously analyzed and bounded.

Network architectures in experiments use depths $4$–$6$, widths $128$–$512$, batch normalization, and appropriate activation functions (Softplus or ReLU) (Chang et al., 2 Feb 2024, Vargas et al., 2021).

4. Discretization, Sampling, and Algorithmic Details

Numerical implementation of the conditional Föllmer flow uses time discretization (Euler or Euler–Maruyama with $K$ steps), in either the ODE or SDE variant:

  • ODE: $z_{k+1} = z_k + h_k\, v_\theta(z_k, y, t_k)$
  • SDE: $\theta_{j+1} = \theta_j + u_\phi(t_j, \theta_j)\,\Delta t + \sqrt{\gamma\,\Delta t}\,\xi_j$, with $\xi_j \sim \mathcal{N}(0, I)$
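The SDE update can be exercised on a hypothetical scalar example where the Föllmer drift is available in closed form. For $\theta_0 = 0$ and a Gaussian target $\mathcal{N}(m, \sigma^2)$, a direct calculation gives $u^*(t, x) = \frac{\gamma m + (\sigma^2 - \gamma)\,x}{\gamma + t(\sigma^2 - \gamma)}$. The sketch below uses this drift as a stand-in for a trained network $u_\phi$; the function names and parameter values are illustrative assumptions.

```python
import math
import random


def gaussian_follmer_drift(t, x, m, sigma, gamma):
    """Closed-form drift steering theta_0 = 0 to N(m, sigma^2) at t = 1 (toy case)."""
    return (gamma * m + (sigma ** 2 - gamma) * x) / (gamma + t * (sigma ** 2 - gamma))


def euler_maruyama_sample(drift, gamma=1.0, num_steps=100, rng=random):
    """One Euler-Maruyama path of d theta = drift dt + sqrt(gamma) dW on [0, 1]."""
    dt = 1.0 / num_steps
    theta = 0.0
    for j in range(num_steps):
        xi = rng.gauss(0.0, 1.0)
        theta += drift(j * dt, theta) * dt + math.sqrt(gamma * dt) * xi
    return theta


if __name__ == "__main__":
    m, sigma, gamma = 2.0, 0.5, 1.0
    rng = random.Random(0)
    drift = lambda t, x: gaussian_follmer_drift(t, x, m, sigma, gamma)
    samples = [euler_maruyama_sample(drift, gamma, rng=rng) for _ in range(4000)]
    mean = sum(samples) / len(samples)
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
    # Should be close to m and sigma, up to sampling and step-size error.
    print(mean, std)
```

The terminal samples should have mean near $m$ and standard deviation near $\sigma$; the remaining gap reflects Monte Carlo noise and the finite step size.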

For conditional generation, the process is initialized with a sample from the base (typically standard Gaussian), then forward-mapped to approximate a sample from the desired conditional law. Both frameworks support batch autoregressive sampling. Implementation pseudocode and full algorithmic details—including batch size, learning rate, optimizer, and data selection—are provided in the respective references (Chang et al., 2 Feb 2024, Vargas et al., 2021, Chen et al., 20 Mar 2024).

Empirical results demonstrate that the approach efficiently generates conditional samples for both low-dimensional simulation tasks and high-dimensional data (e.g., MNIST class-conditional synthesis, image inpainting, probabilistic forecasting of dynamical systems) (Chang et al., 2 Feb 2024, Chen et al., 20 Mar 2024).

5. Theoretical Guarantees and Convergence Analysis

The conditional Föllmer flow framework is accompanied by strong theoretical guarantees. For ODE-based models (Chang et al., 2 Feb 2024):

  • Under boundedness and regularity (Assumptions A1–A4), the flow $F_t$ and velocity $v_F$ are Lipschitz.
  • The main end-to-end result (Theorem 5.1) states that, with $n$ samples and appropriate parameterization, the expected $W_2^2$ error in law satisfies

$$\mathbb{E}_{y \sim \pi(y)} \left[ W_2^2\big( \hat{z}_N(y),\, X \mid Y=y \big) \right] = \tilde{\mathcal{O}}\left(n^{-4/[9(d+d_Y+5)]}\right)$$

with high probability.
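To get a feel for how the rate depends on dimension, the exponent $4/[9(d + d_Y + 5)]$ can be evaluated directly. The dimensions below are hypothetical examples chosen for illustration, not settings reported in the reference.

```python
def rate_exponent(d, d_y):
    """Exponent alpha in the O~(n^{-alpha}) bound, alpha = 4 / (9 (d + d_y + 5))."""
    return 4.0 / (9.0 * (d + d_y + 5))


if __name__ == "__main__":
    for d, d_y in [(1, 1), (10, 5), (784, 10)]:
        print(d, d_y, rate_exponent(d, d_y))
```

The exponent shrinks roughly like $1/(d + d_Y)$, so the bound as stated degrades quickly with the combined dimension of $X$ and $Y$.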

Key error sources—estimation, discretization, and time truncation—are analyzed separately, supporting joint optimization of network, data, and discretization parameters (Chang et al., 2 Feb 2024).

For SDE formulations (Vargas et al., 2021), expressive universality (Theorem 4 of Tzen & Raginsky) guarantees $\mathrm{KL}(\pi_1 \,\|\, \pi_1^\phi) \leq \epsilon$ for any $\epsilon > 0$ with polynomial-sized networks. Euler–Maruyama discretization achieves second-order accuracy (Corollary 2.2), and low-variance gradient estimators can be constructed (the “sticking the landing” estimator), with vanishing variance at the optimum.

The non-singularity, Lipschitz property, and unbiased terminal law enforcement are established for stochastic interpolant-based SDEs, with explicit computation of optimal diffusion schedules $g^F(t)$ for minimizing path-space divergence (Chen et al., 20 Mar 2024).

6. Flows of Conditional Measures and Itô Calculus on Probability Spaces

Conditional Föllmer flows also appear as flows of conditional probability measures on general semimartingales, as rigorously developed in the context of stochastic analysis (Guo et al., 17 Apr 2024). Let $(X_t)_{t \in [0,T]}$ be a càdlàg semimartingale adapted to a filtration $\mathbb{F}$, with an auxiliary “common-noise” filtration $\mathbb{F}^0 \subset \mathbb{F}$. The flow of conditional measures

$$\mu_t = \operatorname{Law}(X_t \mid \mathcal{F}^0_t)$$

provides a path-valued random process in the Wasserstein space of probability measures.
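A simple illustrative instance, not taken from the reference: let $W^0$ and $W$ be independent $d$-dimensional Brownian motions, let $\mathbb{F}^0$ be the filtration generated by $W^0$, and set $X_t = W^0_t + W_t$. Conditioning on the common noise then leaves only the independent part random:

```latex
% Assumed toy setting: X_t = W^0_t + W_t, with W^0 and W independent
\mu_t = \operatorname{Law}\left(X_t \mid \mathcal{F}^0_t\right)
      = \mathcal{N}\left(W^0_t,\; t\, I_d\right).
```

In this example the flow is a Gaussian measure whose mean follows the common noise while its covariance grows deterministically.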

A major technical contribution is the Itô formula for functionals $U(\mu)$ of the flow, extended from classical functionals via construction of conditional independent copies. This development unifies and extends Föllmer’s deterministic flow results and the Lions–Cardaliaguet calculus for McKean–Vlasov equations. The formula handles general mean-field systems, common-noise uncertainty, and semimartingale jumps (Guo et al., 17 Apr 2024).

7. Empirical Performance and Applications

Conditional Föllmer flows have achieved state-of-the-art performance in conditional density estimation, probabilistic forecasting, and conditional generation. Notable empirical results include:

  • Simulation studies with multimodal and heteroskedastic densities: the flow achieves the lowest MSE for mean and standard deviation estimates compared to kernel and FlexCode methods (Chang et al., 2 Feb 2024).
  • Conditional generation in high-dimensional settings: MNIST class-conditional synthesis and inpainting tasks yield high-fidelity and diverse samples, outperforming GAN and SDE-based baselines (Chang et al., 2 Feb 2024).
  • Probabilistic forecasting in dynamical systems (e.g., Navier–Stokes, video prediction): the approach generates accurate, unbiased conditional ensembles of future states (Chen et al., 20 Mar 2024).
  • In all setups, the conditional Föllmer flow demonstrates accurate density modeling, strong coverage, and stable training regimes.

These features, along with rigorous analysis, establish the conditional Föllmer flow as a foundational tool for conditional distribution learning and sampling in modern probabilistic machine learning, Bayesian inference, and stochastic modeling (Vargas et al., 2021, Chang et al., 2 Feb 2024, Chen et al., 20 Mar 2024, Guo et al., 17 Apr 2024).
