Wasserstein Space in Optimal Transport

Updated 25 February 2026

Wasserstein space is a metric space of probability measures defined via optimal transport, characterized by its complete, separable, and geodesic structure.
It supports displacement interpolation and gradient flows, which are pivotal in applications like statistical learning and computational optimization.
The space also admits a formal differential calculus with Riemannian-like structures, enabling advanced analysis in variational and infinite-dimensional problems.

The Wasserstein space constitutes a foundational structure in modern analysis, probability, optimization, and geometry. It rigorously equips the space of Borel probability measures on a metric space with a metric—the Wasserstein distance—derived from optimal transport. This metric endows the space of measures with rich geometric, topological, and analytic properties, making it a primary object of study for metric geometry, gradient flows, statistical learning, and infinite-dimensional variational analysis.

1. Definition and Basic Properties

Given a Polish metric space $(X, d)$ and $p \in [1, \infty)$, the $p$-Wasserstein space $W_p(X)$ is defined as the set of Borel probability measures with finite $p$th moment:

$W_p(X) = \left\{ \mu \in \mathcal{P}(X) : \int_X d(x_0, x)^p\, d\mu(x) < \infty \text{ for some } x_0 \in X \right\}$

By the triangle inequality, the moment condition holds for some $x_0$ if and only if it holds for every $x_0$. The $p$-Wasserstein distance is

$W_p(\mu, \nu) = \left( \inf_{\pi \in \Pi(\mu, \nu)} \int_{X \times X} d(x,y)^p\, d\pi(x, y) \right)^{1/p}$

where $\Pi(\mu, \nu)$ denotes the set of couplings with marginals $\mu$ and $\nu$ (Karimi et al., 2020).
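On the real line the infimum over couplings can be evaluated in closed form, which gives a convenient sanity check for the definition. The sketch below assumes two uniform empirical measures with the same number of atoms; in that case the monotone (sorted) matching is an optimal coupling for every $p \ge 1$. The function name `wasserstein_1d` is illustrative and does not come from the cited works.

```python
import numpy as np

def wasserstein_1d(x, y, p=2.0):
    """W_p between two uniform empirical measures on the real line.

    Assumes x and y hold the same number of samples, each carrying mass 1/n.
    In one dimension, matching sorted samples is an optimal coupling.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch requires equally many samples")
    # (1/n) * sum |x_(i) - y_(i)|^p is the optimal transport cost.
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))

# Shifting every atom by 3 gives W_p = 3 for any p.
x = np.array([0.0, 1.0, 2.0])
print(wasserstein_1d(x, x + 3.0, p=1))  # 3.0
print(wasserstein_1d(x, x + 3.0, p=2))  # 3.0
```

For measures in higher dimensions or with unequal weights, the infimum is a linear program and needs a dedicated solver; the sorted formula applies only to the one-dimensional case.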

The metric space $(W_p(X), W_p)$ is complete and separable whenever $X$ is Polish. When $X$ is compact, $W_p$ metrizes the weak topology on $\mathcal{P}(X)$; on a general Polish space, convergence in $W_p$ is equivalent to weak convergence together with convergence of the $p$th moments (Gomes et al., 2024). In the Euclidean setting, $W_p(\mathbb{R}^d)$ is the canonical setting for optimal transport and distributional analysis.

2. Geometric Structure and Geodesics

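When the base space $X$ is a geodesic space (for example $\mathbb{R}^d$ or a complete Riemannian manifold), $(W_p(X), W_p)$ is a length space, i.e., the distance between two probability measures equals the infimum of the lengths of curves joining them. Its main features are: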

Displacement interpolation: For absolutely continuous measures, there exist canonical constant-speed geodesics; these interpolate measures via optimal transport maps or plans. In $(\mathbb{R}^d, W_2)$, any optimal coupling $\pi^*$ yields a geodesic curve

$\mu_t = \big((1-t)\,x + t\,y\big)_{\#}\pi^*, \quad t \in [0,1]$

with $W_2(\mu_s, \mu_t) = |t-s|\,W_2(\mu_0, \mu_1)$ (Karimi et al., 2020).
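The constant-speed property can be checked numerically in one dimension. This is a minimal sketch, assuming uniform empirical measures on $\mathbb{R}$ with equally many atoms, for which the sorted matching plays the role of $\pi^*$; the helper names and the Gaussian test data are illustrative choices.

```python
import numpy as np

def w2(x, y):
    """W_2 between uniform empirical measures on R with equally many atoms."""
    return float(np.sqrt(np.mean((np.sort(x) - np.sort(y)) ** 2)))

def displacement_interpolation(x0, x1, t):
    """Atoms of mu_t = ((1 - t) x + t y)_# pi* for the sorted coupling pi*."""
    return (1.0 - t) * np.sort(x0) + t * np.sort(x1)

rng = np.random.default_rng(0)
x0 = rng.normal(0.0, 1.0, size=500)   # samples of mu_0
x1 = rng.normal(4.0, 0.5, size=500)   # samples of mu_1

s, t = 0.25, 0.75
mu_s = displacement_interpolation(x0, x1, s)
mu_t = displacement_interpolation(x0, x1, t)

lhs = w2(mu_s, mu_t)
rhs = abs(t - s) * w2(x0, x1)
print(abs(lhs - rhs) < 1e-9)  # True: the curve has constant speed
```

Unlike the linear mixture $(1-t)\mu_0 + t\mu_1$, which reweights two fixed clouds, the interpolated atoms move along straight lines between matched points.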

Benamou–Brenier dynamic formulation: The $W_2$ metric admits a dynamic characterization as an infimum over velocity fields $v_t$ and paths $\rho_t$ solving the continuity equation:

$W_2(\mu, \nu)^2 = \inf_{\rho, v} \int_0^1 \int_X |v_t(x)|^2 d\rho_t(x)\,dt,\quad \partial_t \rho_t + \nabla \cdot (\rho_t v_t) = 0,~\rho_0 = \mu,~\rho_1 = \nu$

(Lanzetti et al., 2024, Hamm et al., 2023).
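A worked example illustrates the dynamic formula. Assume, purely for illustration, that $X = \mathbb{R}^d$ and that $\nu$ is the translate of $\mu \in W_2(\mathbb{R}^d)$ by a fixed vector $a$. Take the path and velocity field

$\rho_t = (x \mapsto x + t a)_{\#}\mu, \qquad v_t(x) \equiv a.$

This pair solves the continuity equation with $\rho_0 = \mu$ and $\rho_1 = \nu$, and its action is

$\int_0^1 \int_{\mathbb{R}^d} |a|^2\, d\rho_t(x)\, dt = |a|^2,$

so $W_2(\mu, \nu) \le |a|$. Conversely, for any coupling $\pi \in \Pi(\mu, \nu)$, Jensen's inequality gives

$\int |x - y|^2\, d\pi(x, y) \ge \left| \int (x - y)\, d\pi(x, y) \right|^2 = |a|^2,$

since the means of $\mu$ and $\nu$ differ by $a$. Hence $W_2(\mu, \nu) = |a|$, and the constant field $v_t \equiv a$ attains the infimum.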

Non-Euclidean geometry: On non-flat or non-Lebesgue base spaces, the Wasserstein space reflects the geometry of $X$. If $X$ is a Hadamard space, $W_2(X)$ is geodesic and features a well-behaved asymptotic boundary; its large-scale geometry mirrors that of $X$ (Bertrand et al., 2010).

3. Differentiable and Riemannian Structure

The Wasserstein space $W_2(M)$ over a Riemannian manifold $(M, g)$ supports a formal (and, for compact manifolds, rigorous) differential calculus:

Tangent spaces: At an absolutely continuous measure $\mu$ (say, with a $C^\infty$ positive density), the tangent space $T_\mu W_2(M)$ can be identified with the closure in $L^2(M,\mu)$ of gradients of smooth functions, i.e.,

$T_\mu W_2(M) = \overline{\{ \nabla \phi : \phi \in C^\infty(M) \}}^{L^2(M,\mu)}$

(Gomes et al., 2024, Lanzetti et al., 2024).

Riemannian metric: The Otto–Lott formalism endows $T_\mu W_2(M)$ with the inner product

$g_\mu(U,V) = \int_M \langle \nabla \phi_U, \nabla \phi_V \rangle_g\,d\mu$

for vector fields $U=\nabla \phi_U,~V=\nabla \phi_V$ (Gomes et al., 2024).

Levi-Civita connection and curvature: Explicit formulas for Christoffel symbols, sectional, and Ricci curvatures are available in the case of closed manifolds and, in particular, for compact Lie groups via Fourier analysis (Gomes et al., 2024).
Continuity equation: Absolutely continuous curves $\mu_t$ in $W_2$ satisfy

$\partial_t \mu_t + \operatorname{div}(\mu_t Z_t) = 0$

for a velocity field $Z_t \in L^2(M, \mu_t; TM)$ (Gomes et al., 2024).

4. Statistical and Algorithmic Applications

The Wasserstein space underpins modern computational and statistical frameworks:

Wasserstein barycenters: The Fréchet mean of measures $\mu_1, \dots, \mu_N$ in $W_2$, with weights $\lambda_i \ge 0$ summing to one, is a minimizer

$\bar{\mu} \in \arg\min_{\mu \in W_2} \sum_{i=1}^N \lambda_i W_2^2(\mu, \mu_i)$

Generalizations involve multimarginal optimal transport; the barycenter is unique when at least one of the $\mu_i$ is absolutely continuous (Karimi et al., 2020).
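In one dimension the barycenter has an explicit form: its quantile function is the weighted average of the input quantile functions. The sketch below relies on that fact and assumes uniform empirical measures on $\mathbb{R}$ with the same number of atoms; the function names and the test data are illustrative, and general-dimensional barycenters require iterative or regularized solvers instead.

```python
import numpy as np

def barycenter_1d(samples, weights):
    """Atoms of the W_2 barycenter of uniform empirical measures on R.

    Assumes every sample array has the same length n. Averaging the sorted
    atoms is the discrete analogue of averaging quantile functions.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to one")
    sorted_atoms = np.stack([np.sort(np.asarray(s, dtype=float)) for s in samples])
    return weights @ sorted_atoms

def w2_squared(x, y):
    """Squared W_2 between uniform empirical measures on R (equal sizes)."""
    return float(np.mean((np.sort(x) - np.sort(y)) ** 2))

def frechet_objective(candidate, samples, weights):
    """Weighted sum of squared W_2 distances from candidate to the inputs."""
    return sum(w * w2_squared(candidate, s) for w, s in zip(weights, samples))

rng = np.random.default_rng(1)
samples = [rng.normal(-2.0, 1.0, 400), rng.normal(3.0, 0.5, 400)]
weights = [0.5, 0.5]

bar = barycenter_1d(samples, weights)
# The barycenter should beat both inputs and any shifted copy of itself.
print(frechet_objective(bar, samples, weights)
      < frechet_objective(bar + 0.1, samples, weights))  # True
```

The result interpolates both location and spread of the inputs, in contrast to the linear mixture of the two measures, which would be bimodal.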

Statistical regression and PCA: Geodesic regression and principal component analysis adapt to $W_2$ using multimarginal OT formulations, with practical algorithms based on entropic regularization or gradient-based schemes (Karimi et al., 2020, Bigot et al., 2013).
Manifold learning: Finite-dimensional submanifolds of Wasserstein space, defined via smooth embeddings and pull-back metrics, admit Riemannian-like geometry. Tangent spaces can be recovered by spectral analysis of covariance operators constructed from optimal transport maps between nearby samples (Hamm et al., 2023).
Optimization and gradient flows: Wasserstein gradient descent and mirror descent algorithms generalize classical optimization to measure-valued variables, exploiting Riemannian and Bregman divergences on $W_2$ (Bonet et al., 2024, Lanzetti et al., 2024).
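For intuition on the optimization item above, consider the simplest functional on $W_2(\mathbb{R}^d)$, the potential energy $F(\mu) = \int V\, d\mu$, whose Wasserstein gradient at $\mu$ is the vector field $\nabla V$. An explicit gradient step then pushes $\mu$ forward by $x \mapsto x - \tau \nabla V(x)$. The following is a minimal particle sketch under stated assumptions: the quadratic potential $V(x) = \tfrac12 |x - m|^2$, the step size, and the particle discretization are illustrative choices, not the schemes of the cited papers.

```python
import numpy as np

def grad_V(x, target):
    """Gradient of the assumed potential V(x) = 0.5 * |x - target|^2."""
    return x - target

def wasserstein_gd(particles, target, step=0.1, n_steps=100):
    """Explicit Wasserstein gradient descent for F(mu) = int V dmu.

    Each iteration applies the pushforward mu <- (Id - step * grad V)_# mu,
    which for an empirical measure moves every particle independently.
    """
    x = np.array(particles, dtype=float)
    for _ in range(n_steps):
        x = x - step * grad_V(x, target)
    return x

rng = np.random.default_rng(2)
particles = rng.normal(size=(200, 2))   # empirical initial measure
target = np.array([1.0, -2.0])          # minimizer m of V

final = wasserstein_gd(particles, target)
energy = 0.5 * np.mean(np.sum((final - target) ** 2, axis=1))
print(energy < 1e-6)  # True: the measure concentrates at the minimizer of V
```

Functionals with interaction or entropy terms couple the particles or need density estimates, so this decoupled update is specific to potential energies.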

5. Boundary, Asymptotics, and Curvature Phenomena

Wasserstein space inherits and enriches the asymptotic and boundary structure of the base space:

Boundary at infinity: When $X$ is a Hadamard space, the visual boundary of $W_2(X)$ corresponds to the space of probability measures on the metric cone over $\partial X$, with the cone topology extending $W_2(X)$ to $\overline{W_2(X)}$ (Bertrand et al., 2010).
Busemann functions: Rays and co-rays in $W_p(X)$ admit representation as evolving measures supported on rays of $X$, and Busemann functions can be defined analogously to Riemannian geometry, encoding asymptotic metric behavior (Zhu et al., 2019).
Curvature and rigidity: $W_2(X)$ is rarely CAT(0) even if $X$ is, but under strict negative curvature, the isometry group of $W_2(X)$ coincides with measure pushforwards by isometries of $X$, in contrast to the Euclidean case, which admits exotic isometries (Bertrand et al., 2014).
Ultrametric and fractal base spaces: When $X$ is a compact ultrametric space, the $p$th power of the distance, $W_p^p$, makes $W_p(X)$ affinely isometric to a convex subset of $\ell^1$; as a consequence, the space exhibits sharp Hölder connectivity properties. The dimension theory for Wasserstein spaces over fractal $X$ relies on ultrametric skeletons and bi-Lipschitz invariants (Kloeckner, 2013).

6. Advanced Topics and Generalizations

Quotients and shape spaces: Quotienting $W_p(X)$ by the pushforward action of subgroups of $\mathrm{Iso}(X)$ yields Wasserstein "shape spaces" of measures modulo isometries, with induced metrics and geodesic structures provided the action is proper and the base space is well-behaved (Lessel, 2025).
Sliced Wasserstein geometry: The sliced Wasserstein metric, obtained by averaging one-dimensional Wasserstein distances between projections of the measures over random directions, gives an alternative geometry that facilitates computation and connects to negative Sobolev norms. However, it lacks a geodesic (length space) structure (Park et al., 2023).
Smooth variational principles: The Wasserstein space supports smooth variational principles, crucial for viscosity solutions of PDEs over measures, employing mollified or sliced metrics, and providing penalization frameworks for infinite-dimensional analysis (Bayraktar et al., 2022).
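The computational appeal of the sliced Wasserstein metric mentioned above comes from the fact that each one-dimensional distance reduces to sorting. The following Monte Carlo sketch assumes two point clouds in $\mathbb{R}^d$ with equally many uniformly weighted points, directions drawn uniformly from the unit sphere, and the convention that the squared distance is the average of squared one-dimensional $W_2$ distances; the function name and default parameters are illustrative.

```python
import numpy as np

def sliced_w2(x, y, n_projections=200, seed=0):
    """Monte Carlo estimate of the sliced W_2 distance between point clouds.

    x, y: arrays of shape (n, d) with the same n, each point carrying mass 1/n.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("this sketch requires clouds of identical shape")
    rng = np.random.default_rng(seed)
    # Normalized Gaussian vectors are uniformly distributed on the sphere.
    theta = rng.normal(size=(n_projections, x.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project onto each direction, then sort: the sorted matching is the
    # optimal one-dimensional coupling for each projection.
    px = np.sort(x @ theta.T, axis=0)
    py = np.sort(y @ theta.T, axis=0)
    # Average squared 1-D costs over atoms and directions, then take the root.
    return float(np.sqrt(np.mean((px - py) ** 2)))

rng = np.random.default_rng(3)
cloud = rng.normal(size=(300, 3))
shift = np.array([2.0, 0.0, 0.0])
print(sliced_w2(cloud, cloud))                                    # 0.0
print(sliced_w2(cloud, cloud + shift) <= np.linalg.norm(shift))   # True
```

For a pure translation by $a$, each projected distance equals $|\langle \theta, a \rangle| \le |a|$, which is why the estimate never exceeds the full $W_2$ distance $|a|$ in this example.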

These structural, geometric, and computational aspects position the Wasserstein space at the interface of analysis, geometry, probability, and data science, with active research spanning from infinite-dimensional geometry to statistical methodology and machine learning.