
Generalized Wasserstein Geometries

Updated 2 April 2026
  • Generalized Wasserstein Geometries are a family of metric structures that expand classical optimal transport to include unbalanced measures, nonlinear projections, and operator-valued data.
  • They leverage innovative approaches such as slicing, Gromov–Wasserstein, and Bregman divergences to compute distances efficiently in complex and non-Euclidean settings.
  • This framework enables robust variational analysis and scalable algorithm design for applications in machine learning, functional data analysis, and statistical inference.

Generalized Wasserstein geometries consist of a broad family of metric structures and optimization frameworks that extend the classical Wasserstein space of probability measures. These generalizations encompass unbalanced transport (allowing for mass creation/destruction), Bregman divergences, slicing-based metrics that exploit nonlinear projections, and Gromov–Wasserstein frameworks that are invariant to isometries and reflect structural information. They also include geometric analysis on spaces of SPD matrices and operators, synthetic and barycentric curvature-dimension conditions on abstract metric-measure spaces (including infinite-dimensional and non-smooth settings), and iterated and hierarchical constructions. This article surveys major constructions, geometric and variational properties, and recent advances in such generalized optimal transport geometries.

1. Unbalanced and Source-Regularized Wasserstein Geometries

A primary generalization of classical Wasserstein space addresses the limitation that $W_p$ is defined only between measures of equal mass. The Piccoli–Rossi metric $W_p^{a,b}$ on finite measures introduces two nonnegative weights, $a$ (mass creation/removal) and $b$ (transport), and, for $\mu,\nu \in \mathcal M(\mathbb{R}^d)$,

$$W_p^{a,b}(\mu,\nu) = \left[ \inf_{\substack{\tilde\mu \le \mu,\ \tilde\nu \le \nu \\ |\tilde\mu| = |\tilde\nu|}} a^p \big( \|\mu - \tilde\mu\|_{\mathrm{TV}} + \|\nu - \tilde\nu\|_{\mathrm{TV}} \big)^p + b^p\, W_p(\tilde\mu, \tilde\nu)^p \right]^{1/p}.$$

This metric interpolates between strict mass-conserving transport ($a \to \infty$) and total variation ($b \to \infty$), defines a complete metric on the space of finite measures, and admits a generalized Benamou–Brenier formula for $W_2^{a,b}$ in which the continuity equation carries both a velocity field $v_t$ and a signed source measure $s_t$,

$$\partial_t \mu_t + \nabla \cdot (v_t\, \mu_t) = s_t, \qquad \mu_0 = \mu, \quad \mu_1 = \nu.$$

The corresponding action is

$$\mathcal B(\mu_t, v_t, s_t) = \int_0^1 \Big( a^2\, |s_t|(\mathbb{R}^d)^2 + b^2 \int_{\mathbb{R}^d} |v_t|^2\, d\mu_t \Big)^{1/2} dt, \qquad W_2^{a,b}(\mu,\nu) = \inf \mathcal B.$$

For $a = b = 1$ and $p = 1$, $W_1^{1,1}$ coincides with the flat (bounded-Lipschitz) metric, i.e.,

$$W_1^{1,1}(\mu,\nu) = \|\mu - \nu\|_{\flat} = \sup\Big\{ \int_{\mathbb{R}^d} \varphi\, d(\mu - \nu) \;:\; \|\varphi\|_{\infty} \le 1,\ \operatorname{Lip}(\varphi) \le 1 \Big\}.$$

This structure underlies well-posedness for nonlinear transport equations with source terms and extends classical duality and stability: the metric space $(\mathcal M(\mathbb{R}^d), W_p^{a,b})$ is geodesic and stable under Gromov–Hausdorff convergence (Piccoli et al., 2013, Chung et al., 2019).
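For $p = 1$ between discrete measures, the static formulation reduces to a small linear program over sub-couplings, since the total-variation and transport terms simply add. The sketch below is our own illustration, not code from the cited papers; the function name and the discrete 1D setup are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def generalized_w1(x, mu, y, nu, a=1.0, b=1.0):
    """W_1^{a,b} between discrete measures sum_i mu_i delta_{x_i} and
    sum_j nu_j delta_{y_j} (1D supports), via a linear program: untransported
    mass is paid at rate a (TV term), transported mass at rate b * distance."""
    m, n = len(mu), len(nu)
    D = np.abs(x[:, None] - y[None, :])          # ground distances |x_i - y_j|
    # Objective: a*(|mu| + |nu|) + sum_ij pi_ij * (b*D_ij - 2a), minimized
    # over sub-couplings pi >= 0 with row sums <= mu and column sums <= nu.
    c = (b * D - 2.0 * a).ravel()
    A_rows = np.kron(np.eye(m), np.ones(n))      # row-sum constraints (<= mu)
    A_cols = np.kron(np.ones(m), np.eye(n))      # column-sum constraints (<= nu)
    res = linprog(c, A_ub=np.vstack([A_rows, A_cols]),
                  b_ub=np.concatenate([mu, nu]), bounds=(0, None))
    return a * (mu.sum() + nu.sum()) + res.fun
```

Untransported mass costs $a$ per unit on each side, so mass moves only where $b\,|x_i - y_j| < 2a$, which makes the interpolation between transport and total variation explicit.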

2. Sliced, Generalized Sliced, and Differentiable Sliced Wasserstein Distances

Metrics based on slicing project high-dimensional distributions onto lower-dimensional spaces to exploit the efficiency of 1D transport. The sliced Wasserstein metric,

$$\operatorname{SW}_p(\mu,\nu)^p = \int_{S^{d-1}} W_p^p\big(\theta^*_{\#}\mu,\ \theta^*_{\#}\nu\big)\, d\sigma(\theta),$$

where $\theta^*(x) = \langle \theta, x \rangle$ is the projection onto direction $\theta \in S^{d-1}$, is generalized by replacing the linear projection with nonlinear or learnable defining functions $g_\theta$. The generalized sliced Wasserstein distance (GSW) for nonlinear $g_\theta$ is

$$\operatorname{GSW}_p(\mu,\nu)^p = \int_{\Omega} W_p^p\big((g_\theta)_{\#}\mu,\ (g_\theta)_{\#}\nu\big)\, d\sigma(\theta).$$

Recent developments provide deterministic and learnable function approximations (polynomials, neural networks), exploiting concentration of high-dimensional random projections, yielding scalability in the ambient dimension $d$ and facilitating moment-based approximations. Differentiable generalized sliced Wasserstein plans (DGSWP) employ a bilevel scheme to select optimal nonlinear projections $g_\theta$,

$$\min_{\theta} \int_{\mathbb{R}^d \times \mathbb{R}^d} c(x,y)\, d\pi_{g_\theta}(x,y),$$
where $\pi_{g_\theta}$ is the transport plan lifted from one-dimensional optimal transport between $(g_\theta)_{\#}\mu$ and $(g_\theta)_{\#}\nu$,

with gradients efficiently estimated by Gaussian smoothing over $\theta$ (Le et al., 2022, Chapel et al., 28 May 2025). Both sliced and generalized sliced Wasserstein metrics are true metrics (under injectivity conditions on the defining functions $g_\theta$) and admit low-complexity computation.
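A Monte Carlo estimator of the sliced distance needs only sorting along random directions; swapping the linear projection for a nonlinear `g` gives a crude GSW estimate. This is an illustrative sketch with names of our own choosing, not a reference implementation.

```python
import numpy as np

def sliced_w2(X, Y, n_proj=200, g=None, rng=None):
    """Monte Carlo estimate of SW_2 between equal-size empirical measures.
    X, Y: (n, d) sample arrays.  g(theta, Z) -> (n,) optionally replaces
    the linear projection <theta, z> (the GSW idea); defaults to linear."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    proj = g if g is not None else (lambda th, Z: Z @ th)
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)           # uniform direction on S^{d-1}
        px, py = np.sort(proj(theta, X)), np.sort(proj(theta, Y))
        total += np.mean((px - py) ** 2)         # 1D W_2^2 via sorted quantiles
    return np.sqrt(total / n_proj)
```

Each slice costs $O(n \log n)$, which is the source of the method's scalability relative to solving a full $d$-dimensional transport problem.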

The min-SWGG proxy leverages generalized Wasserstein geodesics with a line-supported pivot, computing

$$\operatorname{min\text{-}SWGG}_2^2(\mu,\nu) = \min_{\theta \in S^{d-1}} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^2\, d\pi_\theta(x,y),$$
where $\pi_\theta$ is the coupling induced by one-dimensional optimal transport along the direction $\theta$,

which provides a metric that metrizes weak convergence and yields efficient computation and explicit couplings (Mahey et al., 2023).
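For equal-size empirical measures the line-induced plan is just the pairing obtained by sorting projections, so the search reduces to argsorts. The sketch below is our own simplification (random search over directions; the paper also proposes smarter optimization).

```python
import numpy as np

def min_swgg2(X, Y, n_theta=100, rng=None):
    """Random-direction search for the plan induced by sorting projections;
    returns the minimal lifted squared cost, an upper bound on W_2^2 that is
    tight whenever an optimal plan is induced by some line."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    best = np.inf
    for _ in range(n_theta):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)
        ix, iy = np.argsort(X @ theta), np.argsort(Y @ theta)
        # Pair k-th smallest projection of X with k-th smallest of Y.
        cost = np.mean(np.sum((X[ix] - Y[iy]) ** 2, axis=1))
        best = min(best, cost)
    return best
```

Unlike the plain sliced distance, this returns an explicit coupling in the ambient space, which is what makes min-SWGG useful when a transport plan (not just a scalar distance) is needed.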

3. Gromov–Wasserstein and Linearized/Inner-Product Generalizations

Gromov–Wasserstein (GW) distances generalize $W_p$ to compare metric-measure spaces up to measure-preserving isometry, defined by

$$\operatorname{GW}_p\big((X,d_X,\mu),(Y,d_Y,\nu)\big)^p = \inf_{\pi \in \Pi(\mu,\nu)} \iint \big| d_X(x,x') - d_Y(y,y') \big|^p \, d\pi(x,y)\, d\pi(x',y').$$

Linearized GW (LGW) and inner-product GW (IGW) geometries are proposed for computational tractability; e.g., LGW leverages barycentric projections and LOT-based (linear optimal transport) tangent embeddings, reducing quadratic complexity while retaining isometry-invariance properties (Beier et al., 2021). The IGW metric,

$$\operatorname{IGW}(\mu,\nu)^2 = \inf_{\pi \in \Pi(\mu,\nu)} \iint \big| \langle x, x' \rangle - \langle y, y' \rangle \big|^2 \, d\pi(x,y)\, d\pi(x',y'),$$

is analyzed with an associated gradient flow and Riemannian structure; the induced mobility operator modifies the local Wasserstein gradients to encode global structure, with a Benamou–Brenier-like dynamic reformulation and an Otto-calculus-type gradient (Zhang et al., 2024).
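For discrete spaces, the quadratic GW objective for a *fixed* coupling can be evaluated without materializing the four-index tensor, by expanding the square. The sketch below (our own; it evaluates, not minimizes, the objective) illustrates the invariance: isometric distance matrices under a matching coupling give cost zero.

```python
import numpy as np

def gw2_objective(DX, DY, pi):
    """Squared GW objective sum_{i,i',j,j'} |DX[i,i'] - DY[j,j']|^2 pi[i,j] pi[i',j']
    for a given coupling pi, using |a-b|^2 = a^2 - 2ab + b^2 to stay O(n^3)."""
    p, q = pi.sum(axis=1), pi.sum(axis=0)        # marginals of the coupling
    t1 = (DX ** 2 @ p) @ p                       # sum DX^2[i,i'] p_i p_{i'}
    t2 = (DY ** 2 @ q) @ q                       # sum DY^2[j,j'] q_j q_{j'}
    cross = np.trace((pi.T @ DX @ pi) @ DY.T)    # sum DX[i,i'] DY[j,j'] pi pi
    return t1 + t2 - 2.0 * cross
```

The GW distance itself is the infimum of this quantity over couplings, typically approached with entropic or conditional-gradient solvers.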

4. Generalized Wasserstein Geometries on Metric-Measure and Infinite-Dimensional Spaces

The classical $W_2$ geometry on probability measures can be generalized to extended metric-measure spaces $(X, \tau, d, \mathfrak m)$, including abstract Wiener spaces and configuration spaces over Riemannian manifolds. A new barycentric curvature-dimension condition $\operatorname{BCD}(K,N)$ is imposed via Jensen-type variational inequalities for the entropy at barycenters,

$$\operatorname{Ent}_{\mathfrak m}\big(\operatorname{bar}(\Omega)\big) \le \int_{\mathcal P_2(X)} \operatorname{Ent}_{\mathfrak m}(\mu)\, d\Omega(\mu) - \frac{K}{2}\, \operatorname{Var}(\Omega),$$

with $K \in \mathbb{R}$ and $N \in [1,\infty]$. This condition encompasses curvature-dimension properties of the Lott–Sturm–Villani and Ambrosio–Gigli–Savaré theories but is designed to handle branching/non-geodesic and infinite-dimensional settings. Stability under measured Gromov–Hausdorff convergence holds, and existence, uniqueness, and absolute continuity of barycenters are established under mild integrability conditions. Geometric and functional inequalities, including multi-marginal Brunn–Minkowski and functional Blaschke–Santaló inequalities, are obtained directly from the barycentric Jensen inequalities (Han et al., 2024).
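Barycenters are concrete in one dimension, where quantile functions linearize $W_2$ transport: the barycenter of empirical measures with equal sample sizes is the weighted average of sorted samples. A minimal sketch of this classical fact (function name ours):

```python
import numpy as np

def w2_barycenter_1d(samples, weights=None):
    """W_2 barycenter of 1D empirical measures with equal sample sizes.
    In 1D the quantile map linearizes optimal transport, so the barycenter's
    sorted support is the weighted average of the inputs' sorted samples."""
    S = np.stack([np.sort(s) for s in samples])        # (k, n) quantile grids
    w = np.full(len(S), 1.0 / len(S)) if weights is None else np.asarray(weights)
    return w @ S                                       # averaged quantiles
```

In higher dimensions or on metric-measure spaces no such closed form exists, which is precisely where the variational barycentric conditions above do their work.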

Variational structures are lifted via category-theoretic functors to iterated Wasserstein spaces $\mathcal P_2(\mathcal P_2(X))$, with velocity plans and geodesics at every level, and gradient flows defined for suitable functionals (Vauthier, 3 Dec 2025). In spaces of all signed Radon measures, the quotient structure under group actions descends naturally to Wasserstein distances and is compatible with Gromov–Hausdorff stability (Chung et al., 2019).

5. Bregman–Wasserstein and Dualistic Information-Geometric Extensions

The Bregman–Wasserstein divergence arises from replacing the quadratic cost in classical OT by a Bregman divergence, generated by a strictly convex function $\phi$, on $\mathcal P(\mathbb{R}^d)$:
$$\mathcal B_\phi(\mu,\nu) = \inf_{\pi \in \Pi(\mu,\nu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} B_\phi(x,y)\, d\pi(x,y),$$
where $B_\phi(x,y) = \phi(x) - \phi(y) - \langle \nabla\phi(y),\, x - y \rangle$ is the canonical Bregman divergence. This framework induces displacement interpolations corresponding to the primal and dual geodesics of information geometry, recovers the classical Wasserstein geometry for $\phi(x) = \tfrac12 \|x\|^2$, and transports the dualistic (Amari) geometric structure to infinite-dimensional statistical manifolds. An associated generalized Pythagorean theorem, dual connections (primal, dual, Levi-Civita), and corresponding JKO gradient flows are established (Kainth et al., 2023).
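A quick numeric check of the recovery statement: with the quadratic generator, the canonical Bregman divergence is exactly half the squared Euclidean cost, so the Bregman–Wasserstein divergence reduces to (half) the classical $W_2$ cost. Names below are our own illustration.

```python
import numpy as np

def bregman(phi, grad_phi, x, y):
    """Canonical Bregman divergence B_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

# Quadratic generator: B_phi(x, y) = |x - y|^2 / 2, the classical OT cost.
phi = lambda x: 0.5 * np.dot(x, x)
grad_phi = lambda x: x

# A KL-type generator phi(x) = sum x_i log x_i instead yields an
# entropy-weighted (generalized KL) ground cost on the positive orthant.
```

Plugging such a $B_\phi$ in as the ground cost of any discrete OT solver gives the corresponding Bregman–Wasserstein divergence between empirical measures.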

6. SPD Matrix and Operator-Valued Wasserstein Geometries

On the manifold of symmetric positive-definite (SPD) matrices $\operatorname{SPD}(n)$, the $L^2$-Wasserstein geometry is given by the metric tensor

$$g_\Sigma(X, Y) = \tfrac{1}{2} \operatorname{tr}\big(\Gamma_\Sigma[X]\, Y\big),$$

where $\Gamma_\Sigma[X]$ solves the Lyapunov equation $\Gamma_\Sigma[X]\,\Sigma + \Sigma\,\Gamma_\Sigma[X] = X$. The geodesics, exponential and logarithm maps, and explicit positive curvature properties are available (Luo et al., 2020). The Bures–Wasserstein geometry, further generalized as GBW, introduces a metric tensor parameterized by a fixed SPD matrix $M$, leading to geodesics and distances incorporating a Mahalanobis-type precision weighting. This anisotropic structure permits improved statistical efficiency and conditioning, and is generalized to infinite-dimensional and operator-valued settings, e.g., covariance operators acting on Hilbert spaces, via unitized Hilbert–Schmidt operators and an extended Mahalanobis norm. Operator-valued Procrustes geodesics, learnable regularization parameters, and tractable computational schemes for high-dimensional inference are provided (Han et al., 2021, Goomanee et al., 12 Nov 2025).
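Both the closed-form Bures–Wasserstein distance, $d_{BW}(A,B)^2 = \operatorname{tr} A + \operatorname{tr} B - 2 \operatorname{tr}(A^{1/2} B A^{1/2})^{1/2}$, and the metric tensor via the Lyapunov equation are directly computable with SciPy; a sketch with our own function names:

```python
import numpy as np
from scipy.linalg import sqrtm, solve_continuous_lyapunov

def bures_wasserstein(A, B):
    """d_BW(A, B)^2 = tr A + tr B - 2 tr (A^{1/2} B A^{1/2})^{1/2}."""
    rA = sqrtm(A)
    cross = sqrtm(rA @ B @ rA)
    # sqrtm may return a complex array with tiny imaginary noise; keep real part.
    return float(np.trace(A) + np.trace(B) - 2.0 * np.real(np.trace(cross)))

def bw_metric(Sigma, X, Y):
    """Metric tensor g_Sigma(X, Y) = (1/2) tr(Gamma Y), where Gamma solves
    the Lyapunov equation Gamma Sigma + Sigma Gamma = X."""
    Gamma = solve_continuous_lyapunov(Sigma, X)   # solves Sigma G + G Sigma = X
    return 0.5 * float(np.trace(Gamma @ Y))
```

For covariance matrices of centered Gaussians, `bures_wasserstein` is exactly the squared $W_2$ distance between the Gaussians, which is the standard way this geometry appears in statistics.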

7. Geometry on Special Structures and Embeddings

In ultrametric spaces, the $p$-Wasserstein geometry collapses to an affine form, admitting an isometric embedding into a convex subset of an $\ell^1$ Banach space. Geodesics exist only for $p = 1$; otherwise connectivity is via Hölder arcs of exponent $1/p$ (Kloeckner, 2013).
For time series and signed measures, the generalized Wasserstein geometry leverages Jordan decompositions and signed cumulative distribution transforms (SCDT), embedding signals into a flat Hilbert space $L^2$ and providing interpretability and straight-line geodesics in classes generated by template deformations (Li et al., 2022).
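The Jordan-decomposition idea can be sketched numerically: split a signed 1D signal into positive and negative parts and embed each through its quantile function, so that Euclidean distance between embeddings compares the corresponding signed measures. This is a simplified stand-in for the full SCDT (which also records the parts' total masses); all names are our own.

```python
import numpy as np

def signed_quantile_embed(s, grid, n_q=64):
    """Embed a signed 1D signal s sampled on `grid`: Jordan-decompose into
    positive/negative parts, normalize each, and record its quantile function
    F^{-1} on a fixed probability grid; concatenate the two embeddings."""
    qs = np.linspace(0.01, 0.99, n_q)
    parts = []
    for part in (np.maximum(s, 0.0), np.maximum(-s, 0.0)):
        mass = part.sum()
        if mass == 0:
            parts.append(np.zeros(n_q))          # empty part embeds to zero
            continue
        cdf = np.cumsum(part) / mass
        parts.append(np.interp(qs, cdf, grid))   # quantile function F^{-1}(q)
    return np.concatenate(parts)
```

Because the embedding is linear in the quantile (transport) coordinates, straight lines between embeddings correspond to displacement-style interpolations of each signed part.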

8. Outlook and Applications

Generalized Wasserstein geometries enable the handling of mass transfer beyond strict conservation, greater expressivity in comparing complex distributions (e.g., via nonlinear projections or matrix-valued data), and robust analysis in non-Euclidean, infinite-dimensional, or branching settings. They are now central to algorithm design in scalable transport, functional data analysis, statistical learning with manifold constraints, configuration and Wiener spaces, and information-geometric optimization. Recent advances unify these constructions into a broader synthetic and categorical framework, facilitating further extension to abstract settings while retaining the core variational, geometric, and computational underpinnings of optimal transport (Piccoli et al., 2013, Han et al., 2024, Kainth et al., 2023, Goomanee et al., 12 Nov 2025, Chapel et al., 28 May 2025, Le et al., 2022).
