
Dynamic Measure Transport (DMT)

Updated 9 November 2025
  • Dynamic Measure Transport (DMT) is a framework that models the evolution of probability measures via PDEs, unifying optimal transport, control theory, and mean-field games.
  • It employs both deterministic and stochastic dynamics with tilted reference paths to overcome teleportation issues, thereby enhancing sample quality.
  • The framework integrates kernel-based numerical methods and Gaussian processes to solve the underlying optimal control problems, with applications in generative modeling and Bayesian inference.

Dynamic Measure Transport (DMT) is a mathematical framework unifying dynamic formulations of measure transport, optimal control, and mean-field games. It generalizes classical optimal transport by modeling the evolution of probability measures subject to partial differential equations (PDEs) in both finite- and infinite-dimensional settings, with significant implications for sampling, generative modeling, and gradient flows. DMT encompasses both traditional “smooth” optimal transport and extensions to spaces characterized only by weak geometric or topological structure, including applications in probability, analysis, stochastic differential equations, and computational statistics.

1. Formal Definition and Mathematical Structures

Let $X = \mathbb{R}^d$ (or an extended metric-topological space), and fix two Borel probability measures $\eta$ (reference) and $\pi$ (target). DMT models the evolution $\{\mu_t\}_{t \in [0,1]}$ of probability measures such that $\mu_0 = \eta$ and $\mu_1 \approx \pi$, by either deterministic or stochastic dynamics:

$$dX_t = v(X_t, t)\,dt + \sigma\,dW_t,\quad X_0 \sim \eta,$$

where $v: \mathbb{R}^d \times [0,1] \to \mathbb{R}^d$ is a drift field, $\sigma \geq 0$ is a noise parameter, and $W_t$ is standard Brownian motion. The law $\mu_t := \mathrm{Law}(X_t)$ evolves according to:

  • The continuity equation (ODE case, $\sigma = 0$):

$$\partial_t \mu_t + \nabla \cdot (v_t\,\mu_t) = 0,$$

  • The Fokker–Planck equation (SDE case, $\sigma > 0$):

$$\partial_t \mu_t + \nabla \cdot (v_t\,\mu_t) = \frac{\sigma^2}{2}\Delta \mu_t.$$
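
The particle dynamics behind both equations can be simulated with an Euler–Maruyama scheme. The following is a minimal sketch, not the source's implementation: the function name `simulate`, the step count, and the constant drift are illustrative assumptions, chosen so that the law at $t = 1$ is known in closed form.

```python
import numpy as np

def simulate(v, sigma, x0, n_steps=200, rng=None):
    """Euler-Maruyama for dX = v(X, t) dt + sigma dW on t in [0, 1].

    With sigma = 0 this reduces to the explicit Euler scheme for the ODE case.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        noise = rng.standard_normal(x.shape)
        x = x + v(x, t) * dt + sigma * np.sqrt(dt) * noise
    return x

rng = np.random.default_rng(0)
x0 = rng.standard_normal(20000)  # X_0 ~ eta = N(0, 1)

def drift(x, t):
    # Hypothetical constant drift, used only so the result can be checked.
    return 3.0 * np.ones_like(x)

x1_ode = simulate(drift, 0.0, x0, rng=rng)  # continuity-equation case
x1_sde = simulate(drift, 0.5, x0, rng=rng)  # Fokker-Planck case
print(x1_sde.mean(), x1_sde.var())
```

For this drift the ODE case shifts every particle by 3, while the SDE case should give a law close to $\mathcal{N}(3, 1 + \sigma^2)$, which offers a quick sanity check on the discretization.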

In more abstract contexts, DMT is defined on an extended metric-topological measure space $(X,\mathcal{T},d,m)$, where the distance $d$ may take the value $+\infty$ and $m$ is a Radon probability measure. The Cheeger energy $\mathcal{E}_C$ generalizes the Dirichlet energy and induces a dynamic transport cost (see Ambrosio et al., 2015). The DMT/Wasserstein–Cheeger distance $W_{\mathcal{E}_C}$ between absolutely continuous measures is given by a Benamou–Brenier-type formula.
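
For orientation, the classical Euclidean Benamou–Brenier formula for the quadratic Wasserstein distance is recalled here. This is the standard smooth-setting statement, not the Cheeger-energy version from Ambrosio et al. (2015), which is not reproduced in this article.

```latex
W_2^2(\mu_0, \mu_1)
= \inf_{(\mu_t, v_t)} \left\{ \int_0^1 \!\! \int_{\mathbb{R}^d} |v_t(x)|^2 \, d\mu_t(x)\, dt
\;:\; \partial_t \mu_t + \nabla \cdot (v_t\, \mu_t) = 0,\ \
\mu_{t=0} = \mu_0,\ \ \mu_{t=1} = \mu_1 \right\}
```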

2. Reference Paths, Teleportation Phenomena, and Their Limitations

A canonical construction in dynamic measure transport is the geometric annealing (or “annealed”) reference path:

$$\mu^{\mathrm{ref}}(x, t) \propto \eta(x)^{1-t}\, \pi(x)^t,\quad t \in [0,1],$$

with log-density

$$\log \mu^{\mathrm{ref}}(x, t) = (1-t)\log \eta(x) + t\log \pi(x) - \log Z(t),$$

where $Z(t)$ is the time-dependent normalization. This choice is mathematically convenient: analytic expressions for the time derivatives facilitate density-driven algorithmic approaches.
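
As an illustration of that convenience, the time derivative of the log-density can be written out explicitly. The sketch assumes $Z(t) = \int \eta(x)^{1-t}\pi(x)^{t}\,dx$ and that differentiation under the integral sign is justified.

```latex
\partial_t \log \mu^{\mathrm{ref}}(x,t)
  = \log \pi(x) - \log \eta(x) - \frac{d}{dt}\log Z(t),
\qquad
\frac{d}{dt}\log Z(t)
  = \frac{1}{Z(t)} \int \eta(x)^{1-t}\pi(x)^{t}\,\bigl[\log\pi(x) - \log\eta(x)\bigr]\,dx
  = \mathbb{E}_{x\sim\mu^{\mathrm{ref}}(\cdot,t)}\bigl[\log\pi(x) - \log\eta(x)\bigr].
```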

However, when $\eta$ and $\pi$ are multimodal or well-separated, $\mu^{\mathrm{ref}}$ exhibits “teleportation”: most of the mass may abruptly move from one mode to another at a critical time $t \approx t_0$. As a result, transport velocities $v$ must become large or highly irregular, and density-driven learning (aligning a velocity field to $\mu^{\mathrm{ref}}$) often fails to “split” or move mass correctly. Empirically, this shows up as significant sample-quality deficits or mode-dropping in sampling applications, as evidenced in one-dimensional Gaussian mixture experiments (Section 7 below).
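
The effect can be seen numerically on the Gaussian-mixture pair used in Section 7. The sketch below evaluates the annealed density on a grid and tracks how much mass lies to the left of $x = -2$. The grid, the threshold, and all function names are illustrative assumptions, not part of the source.

```python
import numpy as np

def log_normal(x, mean):
    """Log-density of N(mean, 1)."""
    return -0.5 * (x - mean) ** 2 - 0.5 * np.log(2.0 * np.pi)

def log_eta(x):
    return log_normal(x, 0.0)

def log_pi(x):
    # pi = 2/3 N(-8, 1) + 1/3 N(4, 1), evaluated stably in log space.
    return np.logaddexp(np.log(2.0 / 3.0) + log_normal(x, -8.0),
                        np.log(1.0 / 3.0) + log_normal(x, 4.0))

def annealed_density(x, t):
    """Normalized geometric annealing density on a uniform grid x."""
    log_p = (1.0 - t) * log_eta(x) + t * log_pi(x)
    p = np.exp(log_p - log_p.max())
    return p / (p.sum() * (x[1] - x[0]))

def left_mass(x, t, threshold=-2.0):
    """Mass of the annealed density on {x < threshold} (assumed threshold)."""
    p = annealed_density(x, t)
    return p[x < threshold].sum() * (x[1] - x[0])

x = np.linspace(-16.0, 12.0, 5601)
times = [0.0, 0.5, 0.9, 0.95, 0.99, 1.0]
fractions = {t: left_mass(x, t) for t in times}
for t in times:
    print(f"t = {t:4.2f}  left-mode mass = {fractions[t]:.4f}")
```

On this example the left-mode mass stays negligible for most of the time interval and only approaches its target value of $2/3$ very close to $t = 1$, which is the abrupt transfer described above.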

3. Optimal Control and Mean-Field Game Perspective

DMT can be naturally framed as an infinite-dimensional optimal control problem or mean-field game (MFG). For a curve of densities $\rho_t$ and a velocity field $v$, the following variational problem encapsulates DMT with action, smoothness, and fidelity costs:

$$\begin{aligned} \min_{\rho,\,v} \quad & D_{\mathrm{KL}}(\rho(1)\|\pi) + \int_0^1 \left[(1-t)\, D_{\mathrm{KL}}(\rho(t)\|\eta) + t\, D_{\mathrm{KL}}(\rho(t)\|\pi)\right] dt \\ &+ \int_0^1 \mathbb{E}_{x\sim\rho(t)}\left[\frac{1}{2}|v(x,t)|^2\right] dt \\ \text{subject to} \quad & \partial_t \rho + \nabla \cdot (\rho v) = 0,\qquad \rho(0)=\eta. \end{aligned}$$

For each fixed $t$, the KL-interaction $(1-t)\, D_{\mathrm{KL}}(\rho\|\eta) + t\, D_{\mathrm{KL}}(\rho\|\pi)$ has a unique minimizer, given by the geometric annealing path $\mu^{\mathrm{ref}}(\cdot,t)$. The formal first-order optimality system couples a forward continuity equation with a backward Hamilton–Jacobi–Bellman (HJB) equation. It enforces fidelity to the boundary data (the fixed initial measure $\rho(0)=\eta$ and the terminal KL penalty toward $\pi$) together with smoothness/action minimization.
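
The claim about the minimizer of the KL-interaction follows from a one-line identity, sketched here under the assumption that $Z(t) = \int \eta^{1-t}\pi^{t}\,dx$ is finite and that $\rho$ is a probability density:

```latex
(1-t)\,D_{\mathrm{KL}}(\rho\|\eta) + t\,D_{\mathrm{KL}}(\rho\|\pi)
  = \int \rho \log \rho \,dx - \int \rho\,\bigl[(1-t)\log\eta + t\log\pi\bigr]\,dx
  = D_{\mathrm{KL}}\bigl(\rho \,\|\, \mu^{\mathrm{ref}}(\cdot,t)\bigr) - \log Z(t).
```

Since $\log Z(t)$ does not depend on $\rho$ and the KL divergence vanishes only when its arguments coincide, the unique minimizer is $\rho = \mu^{\mathrm{ref}}(\cdot,t)$.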

This MFG/control-theoretic framing enables flexible introduction of path-dependent fidelity and regularization criteria beyond what is available in standard OT formulations.

4. Tilted-Path Formulation and Optimization

To overcome pathologies (e.g., “teleportation”) of standard reference paths, DMT introduces a “tilting function” $g(x,t)$. The density path is reparametrized as

$$\log \rho^g(x, t) = \log \mu^{\mathrm{ref}}(x, t) + g(x, t) - \log Z(t),$$

with $g(\cdot,0)=g(\cdot,1)=0$, so that $\rho^g(0) = \eta$ and $\rho^g(1) = \pi$; here $Z(t)$ denotes the normalizing constant of the tilted density. The optimal control problem is then

$$\min_{v \in \mathcal{V},\, g \in \mathcal{G}} \|v\|_{\mathcal{V}}^2 + \lambda_g \|g\|_{\mathcal{G}}^2$$

subject to the continuity equation $\partial_t \rho^g + \nabla \cdot (v \rho^g)=0$ and constraints on $g$. The spaces $\mathcal{V}, \mathcal{G}$ typically employ Sobolev or reproducing kernel Hilbert space (RKHS) norms to enforce spatial/temporal smoothness; e.g.,

$$\|v\|^2_{\mathcal{V}} = \int_0^1 \|v(\cdot,t)\|^2_{H^s_x}\, dt,\qquad \|g\|^2_{\mathcal{G}} = \int_0^1 \|g(\cdot,t)\|^2_{H^r_x}\, dt.$$

Equivalently, an augmented Lagrangian can be introduced for deriving necessary optimality conditions in mixed PDE form, leading to a flexible Banach-space optimization framework.
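
To make the role of the tilt concrete, the sketch below applies a hand-picked tilt to the annealed path for $\eta = \mathcal{N}(0,1)$ and $\pi = \frac{2}{3}\mathcal{N}(-8,1) + \frac{1}{3}\mathcal{N}(4,1)$. The tilt here is a hypothetical closed-form choice, not a learned minimizer of the control problem; its shape, the `strength` value, and the threshold $x < -2$ are all assumptions made for illustration.

```python
import numpy as np

def log_normal(x, mean):
    return -0.5 * (x - mean) ** 2 - 0.5 * np.log(2.0 * np.pi)

def log_ref_unnormalized(x, t):
    """(1 - t) log eta + t log pi, without the log Z(t) term."""
    log_eta = log_normal(x, 0.0)
    log_pi = np.logaddexp(np.log(2.0 / 3.0) + log_normal(x, -8.0),
                          np.log(1.0 / 3.0) + log_normal(x, 4.0))
    return (1.0 - t) * log_eta + t * log_pi

def tilt(x, t, strength=24.0):
    """Hypothetical tilt g(x, t); the factor t (1 - t) makes it vanish at t = 0, 1."""
    # Smooth step that is close to 1 on the left and close to 0 on the right.
    left_indicator = 0.5 * (1.0 - np.tanh(0.5 * (x + 2.0 * t)))
    return strength * t * (1.0 - t) * left_indicator

def density(x, t, use_tilt):
    """Normalized (tilted) density on a uniform grid; normalization is numerical."""
    log_p = log_ref_unnormalized(x, t)
    if use_tilt:
        log_p = log_p + tilt(x, t)
    p = np.exp(log_p - log_p.max())
    return p / (p.sum() * (x[1] - x[0]))

def left_mass(x, t, use_tilt, threshold=-2.0):
    p = density(x, t, use_tilt)
    return p[x < threshold].sum() * (x[1] - x[0])

x = np.linspace(-16.0, 12.0, 5601)
plain_mid = left_mass(x, 0.5, use_tilt=False)
tilted_mid = left_mass(x, 0.5, use_tilt=True)
print(f"left-mode mass at t = 0.5: reference {plain_mid:.4f}, tilted {tilted_mid:.4f}")
```

Because the tilt vanishes at both endpoints, the tilted path still starts at $\eta$ and ends at $\pi$, while at intermediate times it already carries a substantial share of mass toward the left mode. In the actual method the tilt is an optimization variable regularized by $\lambda_g \|g\|_{\mathcal{G}}^2$ rather than a fixed formula.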

5. Numerical Solution via Gaussian Processes and Collocation

For practical computation, the tilted DMT control problem is discretized via a Gaussian process (GP) and collocation approach:

  • Choose a collocation grid $\{(x_j, t_j)\}_{j=1}^J$ in $\mathbb{R}^d \times [0,1]$ and select boundary points for $t=0,1$.
  • Model the scalar potential $u$ (so $v = \nabla u$) and tilt $g$ as elements of scalar-valued RKHSs $\mathcal{H}_u$, $\mathcal{H}_g$ with product kernels $K_u, K_g$ (typically Matérn-type in space/time with lengthscales $\sigma_x$, $\sigma_t$).
  • Enforce the nonlinear residual of the continuity or Fokker–Planck equation at each interior collocation point, i.e., $F_j(z_j;c)=0$, where $z_j$ collects the needed derivatives and $c$ handles time normalization. At boundaries, impose $g=0$.
  • By the representer theorem, the minimizers $u^*, g^*$ lie in finite spans of kernel sections evaluated or differentiated at grid points.

The empirical optimization reduces to a penalized least-squares problem:

$$\min_{z_u, z_g, c} \quad z_u^\top K_u(\phi, \phi)^{-1} z_u + \lambda_g\, z_g^\top K_g(\psi, \psi)^{-1} z_g + \lambda_{\mathrm{pde}} \sum_{j=1}^J |F_j(z_j; c)|^2 + \lambda_{\mathrm{bc}} \sum_{\text{boundary}} |g|^2,$$

where $K_u(\phi,\phi)$ and $K_g(\psi,\psi)$ are the Gram matrices of the kernels over the collocation functionals $\phi$ and $\psi$ for $u$ and $g$. The problem is solved via trust-region methods such as Levenberg–Marquardt, using a Cholesky parameterization of the Gram matrices.
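
The structure of this solve can be sketched on a toy problem. The example below is not the DMT residual: it replaces $F_j$ with a hypothetical pointwise constraint $z_j + z_j^3 = y_j$, uses a single Matérn-5/2 kernel in one variable, and implements a basic Levenberg–Marquardt loop by hand. It also assumes one reading of "Cholesky parameterization", namely writing $z = Lw$ with $K = LL^\top$, so that $z^\top K^{-1} z = \|w\|^2$ and the whole objective becomes a sum of squares.

```python
import numpy as np

def matern52(a, b, ell):
    """Matern-5/2 kernel matrix between 1D point sets a and b."""
    s = np.sqrt(5.0) * np.abs(a[:, None] - b[None, :]) / ell
    return (1.0 + s + s**2 / 3.0) * np.exp(-s)

def residuals(w, L, y, lam):
    """Stacked residual: RKHS penalty (w) and weighted toy constraint."""
    z = L @ w
    return np.concatenate([w, np.sqrt(lam) * (z + z**3 - y)])

def jacobian(w, L, lam):
    z = L @ w
    dF = (1.0 + 3.0 * z**2)[:, None] * L
    return np.vstack([np.eye(w.size), np.sqrt(lam) * dF])

def levenberg_marquardt(w, L, y, lam, iters=100, mu=1e-2):
    """Damped Gauss-Newton with a simple accept/reject rule for the damping mu."""
    r = residuals(w, L, y, lam)
    for _ in range(iters):
        J = jacobian(w, L, lam)
        step = np.linalg.solve(J.T @ J + mu * np.eye(w.size), -J.T @ r)
        r_new = residuals(w + step, L, y, lam)
        if r_new @ r_new < r @ r:
            w, r, mu = w + step, r_new, mu * 0.3   # accept, relax damping
        else:
            mu *= 10.0                             # reject, increase damping
    return w

x = np.linspace(0.0, 1.0, 15)                      # collocation points
K = matern52(x, x, ell=0.3) + 1e-10 * np.eye(x.size)  # small jitter for stability
L = np.linalg.cholesky(K)
z_true = 0.5 * np.sin(2.0 * np.pi * x)
y = z_true + z_true**3                             # synthetic data for the toy constraint
w = levenberg_marquardt(np.zeros(x.size), L, y, lam=1e6)
z = L @ w
print(np.max(np.abs(z - z_true)))
```

The same pattern would carry over to the full problem by stacking the $u$ and $g$ blocks and replacing the toy constraint with the PDE and boundary residuals; for larger systems a library least-squares solver would be the more practical choice.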

6. Theoretical Results: Representer Theorem and Well-posedness

Representer theorem for DMT collocation: Given a reproducing kernel Hilbert space $\mathcal{H}$ with kernel $K$ and a finite set of bounded linear functionals $\{\ell_i\}_{i=1}^n$, the minimum-norm element of $\mathcal{H}$ satisfying the constraints $\ell_i(h) = d_i$ is

$$h^* = \sum_{i=1}^n \alpha_i K(\cdot, \ell_i),\quad \text{where} \quad K(\ell_j, \ell_i) = \ell_j[K(\cdot, \ell_i)]$$

and $\alpha$ solves $K(\ell, \ell)\, \alpha = d$. This structure ensures that all learned objects admit an efficient parametrization in terms of kernel sections induced by collocation.

Existence of minimizers: Under mild assumptions (smooth, positive densities for the reference path; strictly positive regularizers $\lambda_g, \lambda_{\mathrm{pde}}, \lambda_{\mathrm{bc}}$), the finite-dimensional penalized least-squares problem is coercive and continuous, guaranteeing the existence of minimizers.
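
The representer formula can be checked on a small example that mixes point evaluations with one derivative evaluation. The Gaussian kernel, its lengthscale, the choice of functionals, and the data values below are all assumptions made for illustration; the source's solver uses Matérn-type product kernels.

```python
import numpy as np

ELL = 0.4  # assumed Gaussian-kernel lengthscale

def section(x, functional):
    """K(x, l): the functional l applied to the second kernel argument."""
    kind, p = functional
    k = np.exp(-(x - p) ** 2 / (2.0 * ELL**2))
    return k if kind == "value" else (x - p) / ELL**2 * k

def d_section(x, functional):
    """Derivative in x of the kernel section K(x, l)."""
    kind, p = functional
    k = np.exp(-(x - p) ** 2 / (2.0 * ELL**2))
    if kind == "value":
        return -(x - p) / ELL**2 * k
    return (1.0 / ELL**2 - (x - p) ** 2 / ELL**4) * k

def gram_entry(functional_j, functional_i):
    """K(l_j, l_i) = l_j[K(., l_i)]."""
    kind, q = functional_j
    if kind == "value":
        return section(q, functional_i)
    return d_section(q, functional_i)

# Three point evaluations and one derivative evaluation, with target values d.
functionals = [("value", 0.0), ("value", 0.5), ("value", 1.0), ("deriv", 0.25)]
d = np.array([0.0, 1.0, 0.0, 3.0])

gram = np.array([[gram_entry(lj, li) for li in functionals] for lj in functionals])
alpha = np.linalg.solve(gram, d)

def h_star(x):
    return sum(a * section(x, l) for a, l in zip(alpha, functionals))

def h_star_prime(x):
    return sum(a * d_section(x, l) for a, l in zip(alpha, functionals))

print([h_star(p) for p in (0.0, 0.5, 1.0)], h_star_prime(0.25))
```

The resulting $h^*$ matches the three prescribed values and the prescribed slope, and it is built only from kernel sections evaluated or differentiated at the constraint locations, as the theorem states.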

These results provide a rigorous foundation for kernel-based and collocation-based implementations and imply provable smoothness of the optimal transport velocity fields and tilts.

7. Empirical Assessment and Sampling Applications

In one-dimensional experiments with reference $\eta = \mathcal{N}(0, 1)$ and target $\pi = \frac{2}{3} \mathcal{N}(-8, 1) + \frac{1}{3} \mathcal{N}(4, 1)$, the geometric annealing path $\mu^{\mathrm{ref}}$ fails to transport mass to the leftmost mode, reflecting the teleportation pathology. Learned velocity fields along this path produce samplers missing regions of the target.

The tilted DMT approach (using the Banach-space control framework and GP solver) avoids teleportation and yields smooth, balanced mass transfer into both modes, demonstrated by:

  • Fraction of trajectories capturing the left mode (true = 0.667): reference $\approx 0.005$ vs. tilt-learned $\approx 0.375$.
  • Relative error in mean: $1.80$ (ref) vs. $0.88$ (learned).
  • Relative error in variance: $0.96$ (ref) vs. $0.016$ (learned).
  • Kernel MMD: $0.743$ (ref) vs. $0.137$ (learned).
  • The spatial RKHS norm of the learned velocity remains stable under the tilted path, indicating superior regularity.

Trajectory visualizations corroborate the improved spatial smoothness and sampling fidelity achieved by the tilted DMT method relative to analytic McCann velocity or standard reference paths.
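
The diagnostics listed above can be computed from samples along the following lines. This is a sketch under assumptions: the source does not specify the MMD kernel, its bandwidth, or how "capturing the left mode" is thresholded, so the Gaussian kernel, `bandwidth=1.0`, and the cut at $x < -2$ are illustrative choices, and the "collapsed" sampler is a synthetic stand-in for a mode-dropping method.

```python
import numpy as np

# Exact moments of pi = 2/3 N(-8, 1) + 1/3 N(4, 1).
TRUE_MEAN = (2.0 / 3.0) * (-8.0) + (1.0 / 3.0) * 4.0                  # -4
TRUE_VAR = 1.0 + (2.0 / 3.0) * 64.0 + (1.0 / 3.0) * 16.0 - TRUE_MEAN**2  # 33

def sample_target(n, rng):
    left = rng.random(n) < 2.0 / 3.0
    return np.where(left, -8.0, 4.0) + rng.standard_normal(n)

def left_fraction(samples, threshold=-2.0):
    """Fraction of samples assigned to the left mode (assumed threshold)."""
    return np.mean(samples < threshold)

def relative_error(estimate, truth):
    return abs(estimate - truth) / abs(truth)

def mmd(x, y, bandwidth=1.0):
    """Biased (V-statistic) MMD estimate with a Gaussian kernel."""
    def gram(a, b):
        return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2.0 * bandwidth**2))
    mmd2 = gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean()
    return np.sqrt(max(mmd2, 0.0))

rng = np.random.default_rng(0)
target = sample_target(2000, rng)               # reference draws from pi
good = sample_target(2000, rng)                 # an ideal sampler
collapsed = 4.0 + rng.standard_normal(2000)     # a sampler that misses the left mode

for name, s in [("good", good), ("collapsed", collapsed)]:
    print(name,
          left_fraction(s),
          relative_error(s.mean(), TRUE_MEAN),
          relative_error(s.var(), TRUE_VAR),
          mmd(s, target))
```

A sampler that drops the left mode shows a near-zero left fraction, a large relative error in the mean, and a much larger MMD than an ideal sampler, which is the qualitative pattern reported for the reference path.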

Key implementation and application principles include:

  • Regularization: $\lambda_g$ balances the scale of the tilt; $\lambda_{\mathrm{pde}}$ and $\lambda_{\mathrm{bc}}$ enforce PDE and boundary condition fidelity.
  • Kernel hyperparameters: the lengthscales $\sigma_x$ and $\sigma_t$ govern the spatial/temporal smoothness of the velocity and tilt; higher values yield smoother but less flexible paths.
  • Collocation resolution $J$ drives tradeoffs between accuracy and computational requirements: dense Gram-matrix assembly scales quadratically in $J$, and factorization and linear solver steps scale cubically in the worst case. Inducing point or hierarchical strategies can reduce computational cost.
  • DMT is broadly applicable: generative modeling (continuous normalizing flows, diffusion models), density-driven/annealing samplers, Bayesian inference, obstacle-aware robotic transport (via the tilt $g$), and finetuning of pretrained generative models.

In the context of non-smooth, infinite-dimensional, or “Wiener-like” spaces, DMT generalizes classical optimal transport and the Otto calculus, leveraging the Cheeger energy and Benamou–Brenier-type dynamic characterizations. Heat semigroups generated by Dirichlet forms, analyzed via the evolution variational inequality (EVI), admit contractivity and curvature results extending well beyond the $L^2$-Wasserstein theory (Ambrosio et al., 2015). A plausible implication is that DMT provides a unifying mathematical and computational infrastructure for measure-valued dynamics in both classical and highly singular regimes.

References (1)