2-Wasserstein Convergence Results

Updated 3 February 2026

The theory of asymptotic 2-Wasserstein convergence describes how geometric regularity, spectral properties, and process ergodicity dictate convergence rates in optimal transport.
It covers empirical measures, Markov processes, and discretization schemes, and quantifies the behavior of the 2-Wasserstein metric across stochastic models.
The results have practical implications for high-dimensional statistics, SDE/PDE discretizations, and Bayesian nonparametrics through explicit characterization of convergence speeds and limiting distributions.

Asymptotic 2-Wasserstein Convergence Results

The asymptotic behavior of the quadratic ($2$-Wasserstein) distance between probability measures is fundamental to optimal transport, stochastic processes, high-dimensional statistics, numerical algorithms, and stochastic modeling on geometric domains. The convergence rates, limiting distributions, and structural phenomena underlying $W_2$ convergence are dictated by the interplay between geometric regularity, spectral data, moment structure, process ergodicity, and dimension. Rigorous results span empirical measures, Markov and diffusion processes in manifold and network settings, discretization of SDEs, particle approximations of PDEs on Wasserstein space, and functional CLTs for complex statistics.

1. Core Notions and General Convergence Framework

Let $\mathcal{P}_2(X)$ denote the space of Borel probability measures with finite second moment on a complete separable metric space $(X, d)$. The 2-Wasserstein metric is

$W_2(\mu, \nu) = \left(\inf_{\pi\in\Pi(\mu, \nu)} \int_{X\times X} d(x, y)^2\, d\pi(x, y)\right)^{1/2},$

where $\Pi(\mu, \nu)$ denotes the set of couplings of $\mu$ and $\nu$, that is, probability measures on $X \times X$ with marginals $\mu$ and $\nu$. The rate of $W_2$ convergence for empirical measures, process marginals, or discretization schemes is governed by several factors:

Spectral properties of underlying generators or geometric domains (eigenvalues of diffusion operators, Nash or Poincaré inequalities) (Wang, 2024, Wang, 2023).
Dimension $d$ and moment conditions, which determine minimax rates for empirical measure convergence (Weed et al., 2017, Chae et al., 2020, Borda, 2021).
Coupling, contraction, and Lyapunov conditions for Markov processes or PDE/ODE semigroups (Bolley et al., 2011, Arapostathis et al., 2019, Liu et al., 2023, Gao et al., 2024).
Discrete-to-continuum or manifold-to-network limit structures, affecting convergence on metric graphs and thickened domains (Burger et al., 20 Jan 2026).
Tail, Diophantine, and mixing properties in the presence of randomness, nonstationarity, or arithmetic structure (Wu et al., 2024, Li et al., 2022).
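As a concrete instance of the definition of $W_2$ above: on the real line, for two empirical measures of equal size with uniform weights, the optimal coupling for the quadratic cost is the monotone one that pairs sorted samples. The sketch below relies on that standard one-dimensional fact; the function name `w2_empirical_1d` is illustrative and not taken from the cited works.

```python
import math

def w2_empirical_1d(xs, ys):
    """W_2 between uniform empirical measures on equal-size real samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In 1D with quadratic cost, pairing order statistics is optimal.
    pairs = zip(sorted(xs), sorted(ys))
    mean_sq = sum((x - y) ** 2 for x, y in pairs) / len(xs)
    return math.sqrt(mean_sq)

# Translating every atom by 1 moves the measure by exactly 1 in W_2.
print(w2_empirical_1d([0.0, 2.0, 1.0], [3.0, 1.0, 2.0]))
```

In higher dimensions no such sorting shortcut exists, and the infimum over couplings has to be computed by a linear program or an assignment solver.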

2. Asymptotic Rates: Empirical, Geometric, and Spectral Regimes

The sharp asymptotics of $W_2(\mu_n, \mu)$, where $\mu_n$ is the empirical measure of $n$ i.i.d. samples from a regular measure $\mu$ supported on a $d$-dimensional manifold, metric space, or fractal, are as follows:

i.i.d. in $\mathbb{R}^d$ or smooth compact $d$-manifold:

$\mathbb{E} W_2(\mu_n, \mu) = \Theta(n^{-1/d}),\quad d>2, \qquad \mathbb{E} W_2(\mu_n, \mu) \sim \sqrt{\frac{\log n}{n}},\quad d=2$

These rates are sharp and attained for absolutely continuous measures satisfying mild lower density bounds (Weed et al., 2017, Borda, 2021).

Empirical measures on noncompact or boundary domains: The exponent reflects the interplay of volume growth, spectral gap, and geometry, including the presence of a boundary (which can introduce log-corrections at critical dimensions) (Li et al., 2022, Wang, 2024).
Spectral structure for diffusions and subordination: For ergodic (possibly reflecting or killed) diffusions with generator $L$ and eigenvalues $\{\lambda_i\}$, the large-time asymptotics of the empirical measures $\mu_t = \frac{1}{t}\int_0^t \delta_{X_s}\, ds$ are:

$\mathbb{E}^{\nu}[W_2(\mu_t, \mu)^2] \sim \begin{cases} t^{-1}, & d\leq 3 \\ t^{-1} \log t, & d=4 \\ t^{-2/(d-2)}, & d\geq 5 \end{cases}$

with further normalization leading to explicit spectral constants and CLT-type limiting laws (Wang, 2024, Wang, 2023).
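The rate statements above can be tabulated numerically. The following sketch evaluates the stated orders with all multiplicative constants set to 1, which is an assumption made purely for illustration; the function names are hypothetical. It covers only the cases quoted above ($d \geq 2$ for i.i.d. samples).

```python
import math

def iid_w2_rate(n, d):
    """Order of E[W_2(mu_n, mu)] for n i.i.d. samples, constants omitted."""
    if d < 2:
        raise ValueError("only d >= 2 is covered by the rates quoted here")
    if d == 2:
        return math.sqrt(math.log(n) / n)
    return n ** (-1.0 / d)

def diffusion_w2_squared_rate(t, d):
    """Order of E[W_2(mu_t, mu)^2] for an ergodic diffusion, constants omitted."""
    if d <= 3:
        return 1.0 / t
    if d == 4:
        return math.log(t) / t
    return t ** (-2.0 / (d - 2))

def sample_factor_to_halve(d):
    """Factor by which n must grow to halve n**(-1/d), valid for d > 2."""
    return 2 ** d

for d in (3, 5, 10):
    print(d, iid_w2_rate(10**6, d), sample_factor_to_halve(d))
```

The last function makes the dimension dependence explicit: under the $n^{-1/d}$ rate, halving the error in dimension 10 requires about 1024 times as many samples.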

Setting	Rate $W_2$	Critical Condition
i.i.d. on $d$ -manifold ( $d>2$ )	$n^{-1/d}$	Full support, regular measure
Ergodic diffusion, $d\leq3$	$t^{-1/2}$	Spectral gap, convex or empty boundary
Subordinated process, index $a$	$t^{-(d-2a)/(q+2a)}$	Small-time index $a$ , dimension $d$
Random walk (irrational rotation, torus)	$n^{-1/2}$ , $n^{-\frac{\beta y}{4}}$	Diophantine and characteristic function
Empirical measure (compact group, $d$ )	$n^{-1/d}$	Spectral gap or semisimple

3. Stochastic Processes, Markov Chains, and PDE/SDE Discretizations

Diffusions and Fokker–Planck equations: For solutions $\mu_t$ to $\partial_t \mu_t = \nabla \cdot [\nabla \mu_t + \mu_t A]$ with stationary distribution $v(dx) = e^{-V(x)}dx$, exponential $W_2$ convergence is assured if, for some $\lambda > 0$,

$W_2^2(\mu, v) \leq \frac{1}{\lambda} \int (\nabla \phi - x) \cdot (\nabla \log \mu + A) d\mu$

(the so-called $WJ$-inequality), with decay $W_2(\mu_t, v) \leq e^{-\lambda t} W_2(\mu_0, v)$ (Bolley et al., 2011).

Markov processes with weaker dissipativity: Under Foster–Lyapunov or nonuniform drift conditions, subexponential ergodicity arises:

$W_2(\delta_x P^t, \pi) \lesssim V(x) \exp(-\kappa t^\alpha),\quad 0<\alpha<1,$

while uniform dissipativity yields exponential decay (Arapostathis et al., 2019).

Euler schemes and interacting particle systems: Euler–Maruyama discretizations of contractive diffusions, as well as mean-field particle systems, admit explicit $L^2$-Wasserstein contraction rates:

$W_2(\mu Q^n, \nu Q^n) \leq M e^{-\lambda n h} W_2(\mu, \nu)$

under a one-sided dissipativity condition at infinity and sufficiently high diffusivity. These rates propagate to discrete Poincaré inequalities, exponential decay in KL and total variation, and uniform-in-$N$ propagation of chaos in mean-field systems (Liu et al., 2023).
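Contraction estimates of this type are typically proved by coupling. The sketch below illustrates the simplest variant, a synchronous coupling of two Euler–Maruyama chains driven by the same Gaussian increments, in the globally contractive case rather than under the weaker dissipativity-at-infinity assumption of the cited result. The drift $b(x) = -x - \tfrac12 \sin x$, the step size, and all names are assumptions chosen for illustration. For this drift the pathwise distance shrinks by a factor of at most $1 - 0.5h$ per step, which bounds $W_2$ between the two chain laws from above with $M = 1$.

```python
import math
import random

def drift(x):
    # Illustrative drift with derivative in [-1.5, -0.5] (assumption).
    return -x - 0.5 * math.sin(x)

def coupled_euler(x0, y0, h, n_steps, seed=0):
    """Run two Euler-Maruyama chains with shared noise; return distances."""
    rng = random.Random(seed)
    x, y = x0, y0
    dists = [abs(x - y)]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)  # same increment for both chains
        x = x + h * drift(x) + math.sqrt(2.0 * h) * z
        y = y + h * drift(y) + math.sqrt(2.0 * h) * z
        dists.append(abs(x - y))
    return dists

h, n_steps = 0.1, 50
dists = coupled_euler(3.0, -2.0, h, n_steps)
for n in (0, 10, 50):
    # Observed coupled distance next to the geometric bound.
    print(n, dists[n], (1 - 0.5 * h) ** n * dists[0])
```

Because the noise cancels in the difference of the two chains, the bound holds path by path here. Under dissipativity only at infinity, a synchronous coupling is no longer sufficient and more refined couplings are needed.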

Langevin Monte Carlo and discretized SDEs: Under strong log-concavity ($U$ is $m$-strongly convex), the Euler-Langevin chain $X_{k+1} = X_k - h \nabla U(X_k) + \sqrt{2h} Z_{k+1}$ satisfies

$W_2(\nu_k, \pi) \leq (1 - m h)^k W_2(\nu_0, \pi) + \frac{\sqrt{2L d h}}{m}$

where $L$ denotes the Lipschitz constant of $\nabla U$ and the asymptotic bias is $\mathcal{O}(\sqrt{h d} / m)$ for fixed $L$ (Bonis, 2016).
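The interplay between geometric contraction and discretization bias in this bound can be checked exactly in one dimension. The sketch below assumes the quadratic potential $U(x) = m x^2/2$ (so $\pi = N(0, 1/m)$, $d = 1$, and the gradient of $U$ has Lipschitz constant $m$) together with a Gaussian initial law. Under these assumptions every iterate is Gaussian and $W_2$ has a closed form. The function names are illustrative.

```python
import math

def w2_gauss_1d(m1, s1, m2, s2):
    """W_2 between N(m1, s1^2) and N(m2, s2^2) on the real line."""
    return math.hypot(m1 - m2, s1 - s2)

def lmc_w2_trajectory(m, h, mean0, std0, n_steps):
    """Exact W_2(nu_k, pi) for X_{k+1} = (1 - m h) X_k + sqrt(2 h) Z_{k+1}."""
    target_std = 1.0 / math.sqrt(m)
    mean, var = mean0, std0 ** 2
    out = [w2_gauss_1d(mean, math.sqrt(var), 0.0, target_std)]
    for _ in range(n_steps):
        mean = (1.0 - m * h) * mean
        var = (1.0 - m * h) ** 2 * var + 2.0 * h
        out.append(w2_gauss_1d(mean, math.sqrt(var), 0.0, target_std))
    return out

m, h = 1.0, 0.1
dist = lmc_w2_trajectory(m, h, mean0=5.0, std0=2.0, n_steps=200)
# sqrt(2 L d h) / m with L = m and d = 1 (assumptions of this example)
bias_bound = math.sqrt(2.0 * m * 1 * h) / m
for k in (0, 10, 50, 200):
    print(k, dist[k], (1.0 - m * h) ** k * dist[0] + bias_bound)
```

The distance does not decay to zero: it settles at the gap between the target and the stationary law of the discretized chain. In this Gaussian case that gap is of order $h$, so the $\sqrt{h}$ bias term is an upper bound that is not tight here.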

4. Empirical and Posterior Measures: Statistical Rates and Limit Theorems

Empirical measure convergence: For $d > 2$, the sharp rate for empirical measures in $W_2$ for regular measures on compact $d$-dimensional sets is $n^{-1/d}$, achieved both in expectation and almost surely (Weed et al., 2017, Borda, 2021).
Posterior contraction rates in Bayesian nonparametrics: For priors satisfying Kullback–Leibler support and tail conditions,

$W_2(\Pi_n, P_0) \lesssim n^{-1/5} (\log n),$

holds for Dirichlet process mixtures, with the rate and conditions precisely dictated by moment controls and prior concentration (Chae et al., 2020).

Functional CLT for $W_2$: Sufficient smoothing (by convolving measures with a Gaussian kernel) brings the $W_2$ problem into the parametric convergence regime ($n^{-1/2}$), with a functional delta method yielding asymptotic normality in suitable Sobolev dual spaces (Goldfeld et al., 2022).
$U$-statistics and 2-Wasserstein convergence: Asymptotic normality (i.e., the standard CLT) for $U$-statistics is equivalent to convergence in $W_2$ under minimal mixing and moment conditions, leveraging the equivalence $W_2(Y_n, Y) \to 0 \iff \big(Y_n \Rightarrow Y \text{ and } \mathbb{E}Y_n^2 \to \mathbb{E}Y^2\big)$ (Kroll, 2024).
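The role of the second-moment condition in this equivalence can be seen in a standard counterexample, sketched below. The law $(1 - 1/n)\,\delta_0 + (1/n)\,\delta_{\sqrt{n}}$ converges weakly to $\delta_0$, yet its second moment equals 1 for every $n$, so its $W_2$ distance to $\delta_0$ does not tend to zero. The code uses the identity $W_2(\mu, \delta_0)^2 = \int x^2\, d\mu$, which holds because the only coupling with a Dirac mass is the product measure. The function names are illustrative.

```python
import math

def w2_to_dirac_at_zero(atoms, weights):
    """W_2 between a finitely supported law on R and the Dirac mass at 0."""
    return math.sqrt(sum(w * x * x for x, w in zip(atoms, weights)))

def escaping_mass_law(n):
    """Atoms and weights of (1 - 1/n) delta_0 + (1/n) delta_{sqrt(n)}."""
    return [0.0, math.sqrt(n)], [1.0 - 1.0 / n, 1.0 / n]

for n in (10, 1000, 100000):
    atoms, weights = escaping_mass_law(n)
    mass_away_from_zero = weights[1]  # tends to 0, giving weak convergence
    print(n, mass_away_from_zero, w2_to_dirac_at_zero(atoms, weights))
```

The mass away from the origin vanishes while the $W_2$ distance stays at 1 up to rounding, which is why convergence of second moments cannot be dropped from the equivalence.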

5. Manifolds, Networks, Discrete Structures, and Non-Standard Domains

Convergence on metric graphs and thickened domains: Static $W_2$ transport costs on 3D Minkowski-thickened networks $N^D$ collapse to the limiting graph Wasserstein metric on $\mathcal{G}$ (the "skeleton") with explicit quadratic error $|OT^D - W^2_\mathcal{G}(\mu^0, \nu^0)| = O(D^2)$, including convergence of optimal couplings under atomlessness and uniqueness (Burger et al., 20 Jan 2026).
Empirical rates on compact manifolds and groups: Berry-Esseen smoothing combined with spectral (Fourier/Peter-Weyl) decompositions gives explicit empirical and random walk $W_2$ rates matching the unit-cube optimal matching exponents (Borda, 2021, Wang, 2024).
Subordinated and non-symmetric diffusions: For empirical measures of non-symmetric or subordinated processes on manifolds, spectral expansions yield precise $t\to\infty$ limits and multi-regime power law decay:

$W_2^2(\mu_t, \mu) = O(t^{-1}),\quad O(t^{-1}\log t),\quad \text{or}\quad O(t^{-2/(d-2)}),$

depending on the regime, with constants dependent on the subordinator index, manifold dimension, and spectral data (Wang, 2023, Li et al., 2022, Wang, 2024).

Particle system approximations of Wasserstein PDEs: For particle discretizations of second-order PDEs on Wasserstein space, the analytic comparison yields convergence rate

$\sup_{t,x} |v^N(t,x) - v(t, \mu^x)| \leq C \alpha(N), \quad \alpha(N) = \begin{cases} N^{-1/2}, & d<2 \\ N^{-1/2} \log N, & d=2 \\ N^{-1/d}, & d>2 \end{cases}$

where $v^N$ and $v$ are value functions solving the finite-particle and Wasserstein-space PDEs, respectively (Bayraktar et al., 2024).

6. Smoothed and Dynamic Transport: Heat Flows, Smoothing, and ODE Flows

Heat semigroup and Gaussian smoothing: On manifolds of positive Ricci curvature, $W_2$ contracts exponentially under the heat flow. On flat Euclidean space, only polynomial decay is observed:

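$W_2(P_t \mu, P_t \nu) \sim C_{n+1}\, t^{-n/2},$

when the moments of $\mu$ and $\nu$ agree up to order $n$ and differ at order $n+1$, with explicit constants computable via Hermite expansions and Ornstein–Uhlenbeck operators (Chen et al., 2020). For instance, two centered Gaussians with different variances ($n = 1$) satisfy $W_2(P_t \mu, P_t \nu) \asymp t^{-1/2}$. Absent geometric contraction, the decay rate is thus governed by how many moments of the two measures match.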

Probability flow ODEs (score-based generative models): For score ODEs satisfying strong log-concavity of $-\log p_0$ , continuous-time $W_2$ decay is exponential with explicit contraction rate $\rho$ governed by drift/diffusion coefficients and log-concavity constants:

$W_2(\mathrm{Law}(X_t), \mu) \leq e^{-\rho t} W_2(\mathrm{Law}(X_0), \mu)$

with non-asymptotic discretization and score error corrections, leading to optimal iteration complexity scaling as $\widetilde O(\sqrt{d}/\varepsilon)$ (Gao et al., 2024).

7. Geometric, Arithmetic, and Mixing Constraints

Arithmetic effects (irrational rotations/random walks on torus): The $W_2$ convergence rate for empirical measures of $\{S_n \alpha\}$ depends simultaneously on the Diophantine type $y$ of $\alpha$ and the Hölder exponent $\beta$ of the jump characteristic function, exhibiting a phase transition:

$\mathbb{E} W_2(\mu_n, \mu_\infty) \lesssim \begin{cases} n^{-1/2}, & \beta y < 2 \\ n^{-1/2} (\log n)^{1/2}, & \beta y = 2 \\ n^{-\beta y/4}, & \beta y > 2 \end{cases}$

with matching lower bounds and criticality in the parameter $\beta y$ (Wu et al., 2024).

Mixing and dependence: Weak mixing (e.g., stationary processes with polynomially decaying $\beta$-mixing coefficients) perturbs empirical rates by additive error terms, but the leading exponents persist unless mixing degenerates completely (Borda, 2021, Gao et al., 2024).

These results establish a unified quantitative foundation for $2$-Wasserstein convergence in models ranging from empirical statistics and Bayesian inference to stochastic PDEs, Monte Carlo algorithms, and mean-field limits. The dimension, geometric or spectral regularity, dissipativity/mixing, and analytic structure of the underlying space or process are decisive in both attainable rates and limit laws, with optimal transport theory supplying the central analytic and coupling tools.