2-Wasserstein Convergence Results

Updated 3 February 2026
  • Asymptotic 2-Wasserstein convergence studies fundamental limits in optimal transport by detailing how geometric regularity, spectral properties, and process ergodicity dictate convergence rates.
  • It employs diverse frameworks including empirical measures, Markov processes, and discretization schemes to quantify the behavior of 2-Wasserstein metrics across stochastic models.
  • The results have practical implications for high-dimensional statistics, SDE/PDE discretizations, and Bayesian nonparametrics through explicit characterization of convergence speeds and limiting distributions.

Asymptotic 2-Wasserstein Convergence Results

The asymptotic behavior of the quadratic ($2$-Wasserstein) distance between probability measures is fundamental to optimal transport, stochastic processes, high-dimensional statistics, numerical algorithms, and stochastic modeling on geometric domains. The convergence rates, limiting distributions, and structural phenomena underlying $W_2$ convergence are dictated by the interplay between geometric regularity, spectral data, moment structure, process ergodicity, and dimension. Rigorous results span empirical measures, Markov and diffusion processes in manifold and network settings, discretization of SDEs, particle approximations of PDEs on Wasserstein space, and functional CLTs for complex statistics.

1. Core Notions and General Convergence Framework

Let $\mathcal{P}_2(X)$ denote the space of Borel probability measures with finite second moment on a complete separable metric space $(X, d)$. The 2-Wasserstein metric is

$$W_2(\mu, \nu) = \left(\inf_{\pi\in\Pi(\mu, \nu)} \int_{X\times X} d(x, y)^2\, d\pi(x, y)\right)^{1/2},$$

where $\Pi(\mu, \nu)$ denotes the set of couplings of $\mu$ and $\nu$. The rate of $W_2$ convergence for empirical measures, process marginals, or discretization schemes is characterized by criteria that depend on the setting, as the sections below detail.
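To make the definition concrete, here is a minimal sketch for the simplest case: two empirical measures on the real line with the same number of equally weighted atoms. In that setting the sorted (monotone) pairing is an optimal coupling, so the infimum reduces to a sum over sorted samples. The function name `w2_empirical_1d` and the sample values are illustrative choices, not taken from the text.

```python
import math

def w2_empirical_1d(xs, ys):
    """W_2 between two equal-size, equally weighted empirical measures on R."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    # With cost |x - y|^2 on the real line, pairing sorted atoms is optimal.
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    mean_sq = sum((x - y) ** 2 for x, y in zip(xs_sorted, ys_sorted)) / len(xs)
    return math.sqrt(mean_sq)

# Translating every atom by 3 moves the measure a W_2 distance of exactly 3.
base = [0.0, 1.0, 4.0, 2.5]
shifted = [x + 3.0 for x in base]
dist_shift = w2_empirical_1d(base, shifted)

# Atoms {0, 2} versus a double atom at 1: every pairing costs 1, so W_2 = 1.
dist_atoms = w2_empirical_1d([0.0, 2.0], [1.0, 1.0])

print(dist_shift, dist_atoms)
```

In dimension two and higher no such sorting shortcut exists, and the infimum over couplings becomes an assignment or linear programming problem.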

2. Asymptotic Rates: Empirical, Geometric, and Spectral Regimes

The sharp asymptotics for $W_2(\mu_n, \mu)$, when $\mu_n$ is the empirical measure of $n$ i.i.d. samples from a regular measure supported on a $d$-dimensional manifold, metric space, or fractal, are as follows:

  • i.i.d. in $\mathbb{R}^d$ or smooth compact $d$-manifold:

$$\mathbb{E} W_2(\mu_n, \mu) = \Theta(n^{-1/d}),\quad d>2, \qquad \mathbb{E} W_2(\mu_n, \mu) \sim \sqrt{\frac{\log n}{n}},\quad d=2.$$

These rates are sharp and attained for absolutely continuous measures satisfying mild lower density bounds (Weed et al., 2017, Borda, 2021).

  • Empirical measures on noncompact or boundary domains: The exponent reflects the interplay of volume growth, spectral gap, and geometry, including the presence of a boundary (which can introduce log-corrections at critical dimensions) (Li et al., 2022, Wang, 2024).
  • Spectral structure for diffusions and subordination: For ergodic (possibly reflecting or killed) diffusions with generator $L$ and eigenvalues $\{\lambda_i\}$, the large-time asymptotics for empirical measures is:

$$\mathbb{E}^{\nu}[W_2(\mu_t, \mu)^2] \sim \begin{cases} t^{-1}, & d\leq 3, \\ t^{-1} \log t, & d=4, \\ t^{-2/(d-2)}, & d\geq 5, \end{cases}$$

with further normalization leading to explicit spectral constants and CLT-type limiting laws (Wang, 2024, Wang, 2023).
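A short worked computation shows what the i.i.d. rate $n^{-1/d}$ for $d>2$ means in practice: shrinking the expected error by a fixed factor requires multiplying the sample size by that factor raised to the power $d$. The sketch below ignores multiplicative constants, and `sample_growth_factor` is a hypothetical helper name introduced only for this illustration.

```python
def sample_growth_factor(d, improvement=2.0):
    """Factor by which n must grow so that n**(-1/d) shrinks by `improvement`.

    Solves (c * n)**(-1/d) == n**(-1/d) / improvement for c, giving c = improvement**d.
    Only the d > 2 regime of the i.i.d. rate is covered; constants are ignored.
    """
    if d <= 2:
        raise ValueError("the n**(-1/d) rate is stated for d > 2 only")
    return improvement ** d

# Extra samples needed to halve the expected W_2 error, by dimension.
factors = {d: sample_growth_factor(d) for d in (3, 5, 10, 20)}
for d, factor in factors.items():
    print(f"d = {d:2d}: multiply n by {factor:,.0f}")
```

Halving the error costs 8 times more samples in dimension 3 but over a million times more in dimension 20, which is the curse of dimensionality expressed through the exponent $1/d$.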

| Setting | Rate ($W_2$) | Critical Condition |
|---|---|---|
| i.i.d. on $d$-manifold ($d>2$) | $n^{-1/d}$ | Full support, regular measure |
| Ergodic diffusion, $d\leq 3$ | $t^{-1/2}$ | Spectral gap, convex or empty boundary |
| Subordinated process, index $a$ | $t^{-(d-2a)/(q+2a)}$ | Small-time index $a$, dimension $d$ |
| Random walk (irrational rotation, torus) | $n^{-1/2}$, $n^{-\beta y/4}$ | Diophantine and characteristic function |
| Empirical measure (compact group, $d$) | $n^{-1/d}$ | Spectral gap or semisimple |
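One practical way to compare simulated or measured $W_2$ errors against power-law rates such as these is a least-squares fit on a log-log scale. The sketch below is illustrative only: `fit_rate_exponent` is a hypothetical helper, and the inputs are synthetic values generated from the stated i.i.d. rates with an arbitrary constant, not experimental data.

```python
import math

def fit_rate_exponent(ns, dists):
    """Least-squares slope of log(dist) against log(n)."""
    lx = [math.log(n) for n in ns]
    ly = [math.log(w) for w in dists]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx

ns = [10 ** k for k in range(2, 7)]

# Pure power law n^(-1/d) with d = 3: the fit recovers the exponent -1/3.
d = 3
slope_3d = fit_rate_exponent(ns, [2.0 * n ** (-1.0 / d) for n in ns])

# Critical case d = 2, sqrt(log n / n): the logarithm makes the fitted
# slope look shallower than -1/2 at any finite sample size.
slope_2d = fit_rate_exponent(ns, [math.sqrt(math.log(n) / n) for n in ns])

print(f"d = 3 slope: {slope_3d:.4f}")
print(f"d = 2 slope: {slope_2d:.4f}")
```

The second fit is a reminder that logarithmic corrections at critical dimensions can masquerade as a slightly different exponent over any finite range of $n$.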

3. Stochastic Processes, Markov Chains, and PDE/SDE Discretizations

  • Diffusions and Fokker–Planck equations: For solutions $\mu_t$ to $\partial_t \mu_t = \nabla \cdot [\nabla \mu_t + \mu_t A]$ with stationary distribution $v(dx) = e^{-V(x)}dx$, exponential $W_2$ convergence is assured if, for some $\lambda > 0$,

$$W_2^2(\mu, v) \leq \frac{1}{\lambda} \int (\nabla \phi - x) \cdot (\nabla \log \mu + A)\, d\mu$$

(the so-called $WJ$-inequality, in which $\nabla \phi$ is the optimal transport map pushing $\mu$ forward to $v$), with decay $W_2(\mu_t, v) \leq e^{-\lambda t} W_2(\mu_0, v)$ (Bolley et al., 2011).

  • Markov processes with weaker dissipativity: Under Foster–Lyapunov or nonuniform drift conditions, subexponential ergodicity arises:

$$W_2(\delta_x P^t, \pi) \lesssim V(x) \exp(-\kappa t^\alpha),\quad 0<\alpha<1,$$

while uniform dissipativity yields exponential decay (Arapostathis et al., 2019).

  • Euler schemes and interacting particle systems: Euler–Maruyama discretizations of contractive diffusions, as well as mean-field particle systems, admit explicit $L^2$-Wasserstein contraction rates:

$$W_2(\mu Q^n, \nu Q^n) \leq M e^{-\lambda n h} W_2(\mu, \nu)$$

under a one-sided dissipativity at infinity and sufficiently high diffusivity. These rates propagate to discrete Poincaré inequalities, exponential decay in KL and total variation, and uniform-in-$N$ propagation of chaos in mean-field systems (Liu et al., 2023).

  • Langevin Monte Carlo and discretized SDEs: For strong log-concavity ($U$ being $m$-strongly convex), the Euler-Langevin chain $X_{k+1} = X_k - h \nabla U(X_k) + \sqrt{2h}\, Z_{k+1}$ satisfies

$$W_2(\nu_k, \pi) \leq (1 - m h)^k W_2(\nu_0, \pi) + \frac{\sqrt{2L d h}}{m}$$

where the asymptotic bias is $\mathcal{O}(\sqrt{h d} / m)$ (Bonis, 2016).
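The exponential decay for Fokker–Planck flows and the Langevin Monte Carlo bound can both be checked in closed form in a one-dimensional Gaussian toy model, where $W_2$ between $N(m_1, s_1^2)$ and $N(m_2, s_2^2)$ equals $\sqrt{(m_1-m_2)^2 + (s_1-s_2)^2}$. The sketch below makes several assumptions that are not in the text: the drift is $A(x) = \lambda x$ (an Ornstein–Uhlenbeck process with stationary law $N(0, 1/\lambda)$), the potential is $U(x) = m x^2/2$, and $L$ is read as the Lipschitz constant of $\nabla U$, so that $L = m$ and $d = 1$. All function names are hypothetical.

```python
import math

def w2_gauss_1d(m1, s1, m2, s2):
    """W_2 between N(m1, s1^2) and N(m2, s2^2) on the real line."""
    return math.hypot(m1 - m2, s1 - s2)

def ou_law(m0, s0, lam, t):
    """Mean and std at time t of dX = -lam*X dt + sqrt(2) dW, X_0 ~ N(m0, s0^2)."""
    decay = math.exp(-lam * t)
    var = s0 ** 2 * decay ** 2 + (1.0 - decay ** 2) / lam
    return m0 * decay, math.sqrt(var)

def lmc_law(m0, s0, m, h, k):
    """Mean and std after k Euler-Langevin steps for U(x) = m*x^2/2."""
    mean, var = m0, s0 ** 2
    for _ in range(k):
        mean = (1.0 - m * h) * mean               # drift step on the mean
        var = (1.0 - m * h) ** 2 * var + 2.0 * h  # contraction plus injected noise
    return mean, math.sqrt(var)

lam = m = 1.0                  # assumed: lambda = m = 1
m0, s0 = 3.0, 0.5              # assumed initial law N(3, 0.25)
s_inf = 1.0 / math.sqrt(lam)   # std of the stationary law N(0, 1/lambda)
w0 = w2_gauss_1d(m0, s0, 0.0, s_inf)

# Continuous time: W_2(mu_t, v) <= exp(-lambda t) W_2(mu_0, v) on a time grid.
ou_ok = all(
    w2_gauss_1d(*ou_law(m0, s0, lam, t), 0.0, s_inf) <= math.exp(-lam * t) * w0 + 1e-12
    for t in [0.1 * j for j in range(51)]
)

# Discrete time: contraction term plus the bias term sqrt(2 L d h) / m.
h, L, d = 0.1, m, 1
bias_bound = math.sqrt(2.0 * L * d * h) / m
lmc_ok = all(
    w2_gauss_1d(*lmc_law(m0, s0, m, h, k), 0.0, s_inf) <= (1.0 - m * h) ** k * w0 + bias_bound
    for k in range(201)
)

# Long-run bias of the chain: its invariant law differs from the target.
lmc_bias = w2_gauss_1d(*lmc_law(m0, s0, m, h, 2000), 0.0, s_inf)
print(ou_ok, lmc_ok, lmc_bias, bias_bound)
```

In this toy model the chain's invariant variance is $1/(m(1 - mh/2))$ rather than $1/m$, so the $W_2$ error does not vanish as $k \to \infty$ for fixed $h$. The residual here is far below the general bias term, which must cover less favorable potentials.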

4. Empirical and Posterior Measures: Statistical Rates and Limit Theorems

  • Empirical measure convergence: The sharp rate for empirical measures in $W_2$ for regular measures on compact $d$-dimensional sets with $d>2$ is $n^{-1/d}$, achieved both in expectation and almost surely (Weed et al., 2017, Borda, 2021).
  • Posterior contraction rates in Bayesian nonparametrics: For priors satisfying Kullback–Leibler support and tail conditions,

$$W_2(\Pi_n, P_0) \lesssim n^{-1/5} (\log n),$$

holds for Dirichlet process mixtures, with the rate and conditions precisely dictated by moment controls and prior concentration (Chae et al., 2020).

  • Functional CLT for $W_2$: Sufficient smoothing (by convolving measures with a Gaussian kernel) raises the $W_2$ problem to a parametric convergence regime ($n^{-1/2}$), with a functional delta method yielding asymptotic normality in suitable Sobolev dual spaces (Goldfeld et al., 2022).
  • $U$-statistics and 2-Wasserstein convergence: Asymptotic normality (i.e., the standard CLT) for $U$-statistics is equivalent to convergence in $W_2$ under minimal mixing and moment conditions, leveraging the equivalence $W_2(Y_n, Y) \to 0 \iff \bigl(Y_n \Rightarrow Y \text{ and } \mathbb{E}Y_n^2 \to \mathbb{E}Y^2\bigr)$ (Kroll, 2024).
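The second-moment condition in the equivalence for $W_2$ convergence is not redundant, and a two-atom example shows why. When the limit is the Dirac mass at $0$ there is only one coupling, so $W_2^2$ equals the second moment. The sketch below contrasts a sequence that converges weakly to $\delta_0$ while a vanishing fraction of mass escapes to infinity with one that converges in $W_2$. The sequences and function names are illustrative choices, not taken from the cited work.

```python
import math

def w2_to_dirac_at_zero(atoms, weights):
    """W_2 between a finitely supported measure and the Dirac mass at 0."""
    # Only one coupling exists with a Dirac mass, so W_2^2 is the second moment.
    return math.sqrt(sum(w * x * x for x, w in zip(atoms, weights)))

def escaping_mass(n):
    """Y_n = 0 with probability 1 - 1/n and sqrt(n) with probability 1/n."""
    return [0.0, math.sqrt(n)], [1.0 - 1.0 / n, 1.0 / n]

def shrinking(n):
    """Y_n = 1/n almost surely."""
    return [1.0 / n], [1.0]

ns = [10, 1000, 100000]
w2_escaping = [w2_to_dirac_at_zero(*escaping_mass(n)) for n in ns]
w2_shrinking = [w2_to_dirac_at_zero(*shrinking(n)) for n in ns]

print("escaping mass:", w2_escaping)   # stays near 1
print("shrinking:    ", w2_shrinking)  # tends to 0
```

In the first sequence $P(Y_n \neq 0) = 1/n \to 0$, so $Y_n \Rightarrow \delta_0$, yet $\mathbb{E}Y_n^2 = 1$ for every $n$ and the $W_2$ distance never decreases.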

5. Manifolds, Networks, Discrete Structures, and Non-Standard Domains

  • Convergence on metric graphs and thickened domains: Static $W_2$ transport costs on 3D Minkowski-thickened networks $N^D$ collapse to the limiting graph Wasserstein metric on $\mathcal{G}$ (the "skeleton") with explicit quadratic error $|OT^D - W^2_\mathcal{G}(\mu^0, \nu^0)| = O(D^2)$, including convergence of optimal couplings under atomlessness and uniqueness (Burger et al., 20 Jan 2026).
  • Empirical rates on compact manifolds and groups: Berry-Esseen smoothing combined with spectral (Fourier/Peter-Weyl) decompositions gives explicit empirical and random walk $W_2$ rates matching the unit-cube optimal matching exponents (Borda, 2021, Wang, 2024).
  • Subordinated and non-symmetric diffusions: For empirical measures of non-symmetric or subordinated processes on manifolds, spectral expansions yield precise $t\to\infty$ limits and multi-regime power law decay:

$$W_2^2(\mu_t, \mu) = O(t^{-1}),\quad t^{-1}\log t,\quad t^{-2/(d-2)}$$

with constants dependent on the subordinator index, manifold dimension, and spectral data (Wang, 2023, Li et al., 2022, Wang, 2024).

  • Particle system approximations of Wasserstein PDEs: For particle discretizations of second-order PDEs on Wasserstein space, the analytic comparison yields convergence rate

$$\sup_{t,x} |v^N(t,x) - v(t, \mu^x)| \leq C \alpha(N), \quad \alpha(N) = \begin{cases} N^{-1/2}, & d<2, \\ N^{-1/2} \log N, & d=2, \\ N^{-1/d}, & d>2, \end{cases}$$

where $v^N$ and $v$ are value functions solving the finite-particle and Wasserstein-space PDEs, respectively (Bayraktar et al., 2024).
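The piecewise rate $\alpha(N)$ for particle approximations is easy to tabulate, which helps when choosing a particle count. The sketch below evaluates the three regimes as stated; the constant $C$ is omitted because the text does not give it, and the function name `alpha` is an illustrative choice.

```python
import math

def alpha(n_particles, d):
    """Rate alpha(N) in the three dimension regimes; the constant C is omitted."""
    if n_particles < 2 or d < 1:
        raise ValueError("need N >= 2 particles and dimension d >= 1")
    if d < 2:
        return n_particles ** -0.5
    if d == 2:
        return n_particles ** -0.5 * math.log(n_particles)
    return n_particles ** (-1.0 / d)

# Rate at N = 10,000 particles for a few dimensions.
rates = {d: alpha(10_000, d) for d in (1, 2, 4, 8)}
for d, rate in rates.items():
    print(f"d = {d}: alpha(N) = {rate:.4f}")
```

At $N = 10^4$ the rate degrades from $0.01$ in dimension 1 to $0.1$ in dimension 4, with the logarithmic factor at $d = 2$ sitting in between.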

6. Smoothed and Dynamic Transport: Heat Flows, Smoothing, and ODE Flows

  • Heat semigroup and Gaussian smoothing: On manifolds of positive Ricci curvature, $W_2$ contracts exponentially under the heat flow. On flat Euclidean space, only polynomial decay is observed:

$$W_2(P_t \mu, P_t \nu) \sim C_{n+1} t^{-(n+1)},$$

when moments of $\mu$ and $\nu$ agree to order $n$, with explicit constants computable via Hermite expansions and Ornstein–Uhlenbeck operators (Chen et al., 2020). This demonstrates that moment-friction dominates in the absence of geometric contraction.

  • Probability flow ODEs (score-based generative models): For score ODEs with a strongly log-concave data density $p_0$ (that is, $-\log p_0$ strongly convex), continuous-time $W_2$ decay is exponential with explicit contraction rate $\rho$ governed by drift/diffusion coefficients and log-concavity constants:

$$W_2(\mathrm{Law}(X_t), \mu) \leq e^{-\rho t} W_2(\mathrm{Law}(X_0), \mu)$$

with non-asymptotic discretization and score error corrections, leading to optimal iteration complexity scaling as $\widetilde{O}(\sqrt{d}/\varepsilon)$ (Gao et al., 2024).

7. Geometric, Arithmetic, and Mixing Constraints

  • Arithmetic effects (irrational rotations/random walks on torus): The $W_2$ convergence rate for empirical measures of $\{S_n \alpha\}$ depends simultaneously on the Diophantine type $y$ of $\alpha$ and the Hölder exponent $\beta$ of the jump characteristic function, exhibiting a phase transition:

$$\mathbb{E} W_2(\mu_n, \mu_\infty) \lesssim \begin{cases} n^{-1/2}, & \beta y < 2, \\ n^{-1/2} (\log n)^{1/2}, & \beta y = 2, \\ n^{-\beta y/4}, & \beta y > 2, \end{cases}$$

with matching lower bounds and criticality in the parameter $\beta y$ (Wu et al., 2024).

  • Mixing and dependence: Weak mixing (e.g., stationary processes with polynomially decaying $\beta$-mixing coefficients) perturbs empirical rates by additive error terms, but the leading exponents persist unless mixing degenerates completely (Borda, 2021, Gao et al., 2024).

These results establish a unified quantitative foundation for $2$-Wasserstein convergence in models ranging from empirical statistics and Bayesian inference to stochastic PDEs, Monte Carlo algorithms, and mean-field limits. The dimension, geometric or spectral regularity, dissipativity/mixing, and analytic structure of the underlying space or process are decisive in both attainable rates and limit laws, with optimal transport theory supplying the central analytic and coupling tools.
