Wasserstein-Cramér-Rao Theory
- Wasserstein-Cramér-Rao theory is a framework that recasts the limits of unbiased estimation by using sensitivity, defined via the 2-Wasserstein Riemannian metric, instead of variance.
- It establishes a new Cramér-Rao-type bound, the WCR bound, by combining the Wasserstein information matrix with a matrix Cauchy–Schwarz inequality applied to estimator gradients.
- The theory identifies conditions under which transport families and e-geodesics yield sensitivity-efficient estimators, with applications in robust estimation under additive noise.
Wasserstein-Cramér-Rao theory recasts fundamental limits of unbiased statistical estimation by replacing the variance—traditionally analyzed via the Fisher-Rao geometry—with a new notion of sensitivity defined through the 2-Wasserstein Riemannian structure. Sensitivity quantifies the instability of an estimator under infinitesimal additive perturbations rather than resampling variability, leading to an alternative Cramér-Rao-type bound known as the Wasserstein-Cramér-Rao (WCR) lower bound. This framework enables rigorous characterization and attainment criteria for estimators optimized for sensitivity, identifies analogues of exponential families called transport families or e-geodesics, and clarifies the roles of Wasserstein projection estimators and their asymptotic properties.
1. Classical and Wasserstein Geometries in Estimation Theory
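In classical parametric estimation, the uncertainty of an unbiased estimator $T_n = T_n(X_1,\dots,X_n)$ of a parameter $\theta$ is captured by its variance (or covariance matrix) $\mathrm{Cov}_\theta(T_n)$.
Under differentiability in quadratic mean (DQM), the score function $s_\theta = \nabla_\theta \log p_\theta$ and the Fisher information matrix $I(\theta) = \mathbb{E}_\theta[s_\theta(X)\,s_\theta(X)^\top]$ form the basis of the Cramér-Rao bound for $n$ i.i.d. observations:
$$\mathrm{Cov}_\theta(T_n) \succeq \frac{1}{n}\, I(\theta)^{-1}.$$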
The underlying geometry relies on the Fisher-Rao (equivalently, Hellinger) Riemannian metric on the space of probability laws.
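Wasserstein-Cramér-Rao theory pivots to the geometry induced by the 2-Wasserstein metric on $\mathcal{P}_2(\mathbb{R}^d)$, the space of probability measures with finite second moment. The 2-Wasserstein distance,
$$W_2(\mu,\nu) = \left( \inf_{\pi \in \Pi(\mu,\nu)} \int \|x-y\|^2 \, d\pi(x,y) \right)^{1/2},$$
where $\Pi(\mu,\nu)$ denotes the set of couplings of $\mu$ and $\nu$, admits a formal Riemannian structure, and infinitesimal perturbations of the data are analyzed via transport maps and their linearizations.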
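As a small illustration of the distance just described, the sketch below computes $W_2$ between two one-dimensional empirical measures with the same number of atoms, where the optimal coupling simply pairs sorted values. The function name `w2_empirical_1d` and the equal-sample-size restriction are assumptions made for this example, not part of the cited work.

```python
import numpy as np

def w2_empirical_1d(x, y):
    """2-Wasserstein distance between two 1-D empirical measures with equally many atoms."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    # In one dimension the optimal coupling matches order statistics.
    return float(np.sqrt(np.mean((x - y) ** 2)))

rng = np.random.default_rng(0)
sample = rng.normal(size=1000)
shift = 0.5
# Translating every atom by `shift` moves the measure by exactly |shift| in W2.
print(w2_empirical_1d(sample, sample + shift))
```

The translation case is the simplest instance of the additive perturbations that the sensitivity notion is built around.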
2. Sensitivity, Dirichlet Energy, and Wasserstein Information
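The sensitivity of an estimator $T_n$ to additive noise of scale $\epsilon > 0$ is defined by
$$\mathrm{Sen}^{\epsilon}_\theta(T_n) = \frac{1}{\epsilon^2}\, \mathbb{E}\Big[ \big( T_n(X_1 + \epsilon \xi_1, \dots, X_n + \epsilon \xi_n) - T_n(X_1, \dots, X_n) \big)^2 \Big],$$
where $X_1,\dots,X_n \sim P_\theta$ are i.i.d. and $\xi_1,\dots,\xi_n$ are i.i.d. mean-zero perturbations with identity covariance, independent of the data. Formal expansion as $\epsilon \to 0$ gives the limiting Dirichlet energy,
$$\mathrm{Sen}_\theta(T_n) = \mathbb{E}_\theta\big[ \|\nabla T_n(X_1,\dots,X_n)\|^2 \big].$$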
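The Wasserstein transport linearization for smoothly varying models with $\theta \in \mathbb{R}^p$ is
$$t_{\theta \to \theta + h}(x) = x + \Phi_\theta(x)\, h + o(\|h\|) \quad \text{in } L^2(P_\theta),$$
where $t_{\theta \to \theta + h}$ is the optimal transport map from $P_\theta$ to $P_{\theta + h}$ and $\Phi_\theta(x) \in \mathbb{R}^{d \times p}$ is the Wasserstein score, with the Wasserstein information matrix
$$J(\theta) = \mathbb{E}_\theta\big[ \Phi_\theta(X)^\top \Phi_\theta(X) \big].$$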
This matrix plays an analogous role to Fisher information, quantifying the response in Wasserstein geometry to infinitesimal parameter changes.
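The relation between finite-noise sensitivity and its Dirichlet-energy limit can be checked numerically. The sketch below assumes the perturbation form $\mathbb{E}[(T(X+\epsilon\xi)-T(X))^2]/\epsilon^2$ with standard Gaussian $\xi$, and uses the second-moment estimator on uniform data purely as an illustration; the helper names are hypothetical.

```python
import numpy as np

def finite_eps_sensitivity(estimator, data, eps, rng):
    """Monte Carlo estimate of E[(T(X + eps*xi) - T(X))^2] / eps^2.

    `data` has shape (replicates, n); `estimator` maps it to shape (replicates,).
    Standard Gaussian noise is an assumption of this sketch.
    """
    noise = rng.standard_normal(data.shape)
    diff = estimator(data + eps * noise) - estimator(data)
    return float(np.mean(diff ** 2) / eps ** 2)

def dirichlet_energy(grad_estimator, data):
    """Monte Carlo estimate of E||grad T(X_1, ..., X_n)||^2."""
    grads = grad_estimator(data)  # shape (replicates, n)
    return float(np.mean(np.sum(grads ** 2, axis=1)))

def second_moment(x):
    return np.mean(x ** 2, axis=1)

def second_moment_grad(x):
    return 2.0 * x / x.shape[1]

rng = np.random.default_rng(1)
n = 20
data = rng.uniform(0.0, 1.0, size=(100_000, n))
fe = finite_eps_sensitivity(second_moment, data, eps=1e-3, rng=rng)
de = dirichlet_energy(second_moment_grad, data)
print(fe, de, 4.0 / (3.0 * n))  # both estimates should be close to 4/(3n)
```

For this estimator the Dirichlet energy is $4\,\mathbb{E}[X^2]/n = 4/(3n)$ on Uniform$[0,1]$ data, which gives a closed-form value to compare against.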
3. The Wasserstein-Cramér-Rao Inequality
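A key result establishes, for any estimator $T_n$ that is unbiased for $\chi(\theta)$, that
$$\mathbb{E}_\theta\big[ D T_n \, D T_n^\top \big] \succeq \frac{1}{n}\, D\chi(\theta)\, J(\theta)^{-1}\, D\chi(\theta)^\top,$$
where $D T_n$ is the Jacobian of $T_n$ with respect to the data and $J(\theta)$ is the Wasserstein information matrix, and, in scalar parameter cases,
$$\mathrm{Sen}_\theta(T_n) = \mathbb{E}_\theta\big[ \|\nabla T_n\|^2 \big] \ge \frac{\chi'(\theta)^2}{n\, J(\theta)}.$$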
This is the Wasserstein–Cramér–Rao lower bound, which governs the fundamental limit of estimator sensitivity under infinitesimal additive perturbation noise (Trillos et al., 10 Nov 2025, Nishimori et al., 15 Jun 2025). The result follows via formal Riemannian geometry reasoning and a matrix Cauchy–Schwarz inequality applied to gradients of the estimator and the Wasserstein score.
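The Cauchy–Schwarz step can be written out in the scalar case. The following is a formal sketch, assuming that differentiation under the expectation is justified and writing $\Phi_\theta$ for the Wasserstein score (the velocity field along which $P_\theta$ moves as $\theta$ varies) and $J(\theta) = \mathbb{E}_\theta\|\Phi_\theta(X)\|^2$; regularity conditions are omitted.

```latex
\begin{align*}
\chi(\theta) &= \mathbb{E}_\theta\big[T_n(X_1,\dots,X_n)\big]
  && \text{(unbiasedness)} \\
\chi'(\theta) &= \mathbb{E}_\theta\Big[\sum_{i=1}^n
  \big\langle \nabla_{x_i} T_n(X_1,\dots,X_n),\, \Phi_\theta(X_i) \big\rangle\Big]
  && \text{(transport each } X_i \text{ along } \Phi_\theta\text{)} \\
\chi'(\theta)^2 &\le
  \mathbb{E}_\theta\Big[\sum_{i=1}^n \|\nabla_{x_i} T_n\|^2\Big]\,
  \mathbb{E}_\theta\Big[\sum_{i=1}^n \|\Phi_\theta(X_i)\|^2\Big]
  && \text{(Cauchy--Schwarz)} \\
&= \mathbb{E}_\theta\big[\|\nabla T_n\|^2\big] \cdot n\, J(\theta).
\end{align*}
```

Dividing by $n J(\theta)$ gives the scalar bound, and equality requires $\nabla_{x_i} T_n$ to be proportional to $\Phi_\theta(X_i)$ for every $i$.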
4. Exact Efficiency: Transport Families and E-geodesics
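A model is termed a transport family if there exist a potential $\varphi : \mathbb{R}^d \to \mathbb{R}^p$ and a parameterization $\theta$ such that the Wasserstein score factors as
$$\Phi_\theta(x) = \nabla\varphi(x)\, \Lambda(\theta),$$
where $\Lambda(\theta) \in \mathbb{R}^{p \times p}$ is invertible.
Sensitivity-efficient estimators are characterized by attaining the WCR bound with equality, i.e.,
$$\mathbb{E}_\theta\big[ D T_n \, D T_n^\top \big] = \frac{1}{n}\, D\chi(\theta)\, J(\theta)^{-1}\, D\chi(\theta)^\top$$
for all $\theta \in \Theta$, where $D T_n$ is the Jacobian of $T_n$ with respect to the data. In transport families, the estimator
$$T_n = \frac{1}{n} \sum_{i=1}^n \varphi(X_i)$$
is unbiased for $\chi(\theta) = \mathbb{E}_\theta[\varphi(X)]$ (up to a constant) and is sensitivity-efficient.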
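A short check of why the averaged potential attains the bound, written for a scalar parameter. It assumes the Wasserstein score has the factored form $\Phi_\theta = \lambda(\theta)\,\nabla\varphi$ with $\lambda(\theta) \neq 0$; this form is adopted here for illustration, and the symbols $\lambda$ and $\varphi$ are notation of this sketch.

```latex
\begin{align*}
T_n &= \frac{1}{n}\sum_{i=1}^n \varphi(X_i), \qquad
\chi(\theta) = \mathbb{E}_\theta[\varphi(X)], \\
\mathrm{Sen}_\theta(T_n) &= \frac{1}{n}\,\mathbb{E}_\theta\|\nabla\varphi(X)\|^2, \\
\chi'(\theta) &= \mathbb{E}_\theta\big\langle \nabla\varphi(X), \Phi_\theta(X)\big\rangle
  = \lambda(\theta)\,\mathbb{E}_\theta\|\nabla\varphi(X)\|^2, \\
J(\theta) &= \mathbb{E}_\theta\|\Phi_\theta(X)\|^2
  = \lambda(\theta)^2\,\mathbb{E}_\theta\|\nabla\varphi(X)\|^2, \\
\frac{\chi'(\theta)^2}{n\,J(\theta)} &= \frac{1}{n}\,\mathbb{E}_\theta\|\nabla\varphi(X)\|^2
  = \mathrm{Sen}_\theta(T_n).
\end{align*}
```

The factor $\lambda(\theta)$ cancels, which is why the same estimator is efficient simultaneously for every $\theta$.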
In one-parameter models, e-geodesics (in the sense of the Otto metric) further specify that the Wasserstein score does not depend on $\theta$ after monotone reparameterization. Existence of exact Wasserstein-efficient estimators is deeply connected to the geometry of transport families and e-geodesics, unlike the broader scope of exponential families in Fisher-Rao theory (Nishimori et al., 15 Jun 2025).
5. Asymptotic Efficiency and the Wasserstein Projection Estimator
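Given i.i.d. observations $X_1,\dots,X_n \sim P_\theta$, the Wasserstein projection estimator (WPE) is defined as
$$\hat\theta_n \in \arg\min_{\vartheta \in \Theta} W_2\big(\hat P_n, P_\vartheta\big),$$
with $\hat P_n = \frac{1}{n}\sum_{i=1}^n \delta_{X_i}$, the empirical measure.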
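Definition: WPE is asymptotically sensitivity-efficient if (stated here for scalar $\theta$)
$$n\, \big\|\nabla \hat\theta_n(X_1,\dots,X_n)\big\|^2 \to J(\theta)^{-1}$$
in $P_\theta$-probability.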
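In univariate settings ($d = 1$) with smooth quantile map $F_\theta^{-1}$,
$$W_2^2\big(\hat P_n, P_\theta\big) = \int_0^1 \big( \hat F_n^{-1}(u) - F_\theta^{-1}(u) \big)^2 \, du,$$
and, with bounded support and positive density,
$$n\, \mathrm{Sen}_\theta(\hat\theta_n) \to J(\theta)^{-1},$$
where $J(\theta)^{-1}$ equals the asymptotic sensitivity bound. For multidimensional cases, analogous asymptotic results require stronger regularity (envelope-theorem-style arguments), but the theory generalizes in principle (Trillos et al., 10 Nov 2025).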
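For a concrete univariate case, the sketch below works out the Wasserstein projection onto the uniform scale family $\{\mathrm{Uniform}[0,\theta]\}$, whose quantile map is $F_\theta^{-1}(u) = \theta u$. Minimizing the quantile form of $W_2^2$ gives a closed-form L-statistic; the derivation in the comments and the function name `wpe_uniform_scale` are this example's own, and the comparison value $1/J(\theta) = 3$ assumes $J(\theta) = 1/3$ for this family.

```python
import numpy as np

def wpe_uniform_scale(x):
    """Closed-form Wasserstein projection onto {Uniform[0, theta] : theta > 0}."""
    x_sorted = np.sort(np.asarray(x, dtype=float))
    n = x_sorted.size
    ranks = np.arange(1, n + 1)
    # Minimizing sum_i int_{(i-1)/n}^{i/n} (x_(i) - theta * u)^2 du over theta
    # gives theta_hat = 3 * sum_i x_(i) * (2i - 1) / (2 n^2).
    weights = 3.0 * (2.0 * ranks - 1.0) / (2.0 * n ** 2)
    return float(weights @ x_sorted), weights

rng = np.random.default_rng(0)
theta_true, n = 2.0, 5000
theta_hat, weights = wpe_uniform_scale(rng.uniform(0.0, theta_true, size=n))
# The gradient w.r.t. each observation is the weight of its rank (almost everywhere),
# so the squared gradient norm is deterministic.
rescaled_sensitivity = n * float(np.sum(weights ** 2))
print(theta_hat, rescaled_sensitivity)
```

Here the rescaled sensitivity equals $3 - 3/(4n^2)$, so it approaches $3$ as $n$ grows, which is consistent with asymptotic sensitivity-efficiency in this family.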
6. Examples: Gaussian, Uniform, Laplace, Regression, and Pareto Families
A selection of models illustrates where variance- and sensitivity-efficiency coalesce or diverge.
- Gaussian location ($P_\theta = N(\theta, 1)$):
$X_1,\dots,X_n \sim N(\theta, 1)$ i.i.d. Here, $I(\theta) = 1$ and $J(\theta) = 1$. Both classical and Wasserstein bounds are $1/n$, and the sample mean attains both.
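- Uniform scale on $[0, \theta]$:
$\Phi_\theta(x) = x/\theta$, $J(\theta) = 1/3$, and the WCR bound for $\chi(\theta) = \theta$ is $3/n$. The only estimand admitting an exactly sensitivity-efficient unbiased estimator is $\theta^2$ (up to affine transformations); the plug-in estimator $\frac{3}{n}\sum_{i=1}^n X_i^2$ attains the WCR bound. The delta method yields an asymptotically optimal estimator for $\theta$ itself: $\hat\theta_n = \big(\frac{3}{n}\sum_{i=1}^n X_i^2\big)^{1/2}$.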
- Laplace location:
The sample median has sensitivity of constant order in $n$ (for odd $n$ its gradient is a standard basis vector, so the sensitivity equals $1$), but the sample mean has $1/n$ sensitivity, making the mean sensitivity-efficient though not variance-efficient.
- Linear regression (fixed design):
Ordinary least squares (OLS) is sensitivity-efficient.
- Pareto families:
Specific L-statistics achieve exact W-efficiency for certain parameterizations.
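Some of the claims in these examples can be checked numerically. The sketch below uses a finite-difference gradient (an approximation chosen for this illustration) to compare the squared gradient norms of the sample mean and sample median, and a Monte Carlo average to compare the sensitivity of the second-moment plug-in estimator $\frac{3}{n}\sum_i X_i^2$ under Uniform$[0,\theta]$ data with $\chi'(\theta)^2 / (n J(\theta)) = 12\theta^2/n$, assuming $\chi(\theta) = \theta^2$ and $J(\theta) = 1/3$.

```python
import numpy as np

def squared_gradient_norm(estimator, x, h=1e-6):
    """Central finite-difference approximation of ||grad T(x)||^2."""
    x = np.asarray(x, dtype=float)
    grad = np.empty_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (estimator(x + step) - estimator(x - step)) / (2.0 * h)
    return float(grad @ grad)

rng = np.random.default_rng(0)
n = 11  # odd, so the median is a single order statistic

# Laplace location: mean versus median.
x = rng.laplace(size=n)
mean_sq = squared_gradient_norm(np.mean, x)      # close to 1/n
median_sq = squared_gradient_norm(np.median, x)  # close to 1

# Uniform scale: plug-in estimator (3/n) * sum(x_i^2) has gradient 6 * x_i / n.
theta = 2.0
samples = rng.uniform(0.0, theta, size=(100_000, n))
plug_in_sensitivity = float(np.mean(np.sum((6.0 * samples / n) ** 2, axis=1)))
wcr_bound = 12.0 * theta ** 2 / n

print(mean_sq, median_sq, plug_in_sensitivity, wcr_bound)
```

The median's squared gradient norm does not shrink with $n$ because a perturbation of the middle observation passes through to the estimate at full size, whereas the mean spreads each perturbation over all $n$ observations.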
A summary table:
| Model | Sensitivity-efficient Estimator | Sensitivity Bound Value |
|---|---|---|
| Gaussian (location) | Sample Mean | $1/n$ |
| Uniform Scale | $\frac{3}{n}\sum_i X_i^2$ (for $\theta^2$) | $3/n$ (for $\theta$, scalar case) |
| Laplace (location) | Sample Mean | $1/n$ |
| Linear Regression | OLS | Computed via Wasserstein info matrix |
7. Context, Limitations, and Relation to Broader Theory
Wasserstein-Cramér-Rao theory is motivated by the need to analyze estimator instability outside the scope of resampling variability. Sensitivity—Dirichlet energy of the estimator—captures the reaction to infinitesimal additive noise and aligns with important practical settings: measurement error models, local differential privacy (noise injection), and distributionally robust optimization in Wasserstein ambiguity sets.
The sensitivity bound parallels classical theory: the Fisher-Rao geometry yields variance-based Cramér-Rao limits, while the Otto/Wasserstein geometry yields sensitivity limits. These limits sometimes reveal new optimality properties (e.g., new L-statistics) and sometimes confirm that classical estimators such as the sample mean remain optimal, though not invariably: the MLE in non-Gaussian location models may fail to be sensitivity-optimal.
A plausible implication is that transport geometry may admit further generalization to nonparametric or finite-sample settings, or to alternative instability measures under different metrics (total variation, Sobolev). However, explicit sensitivity-efficient estimators in multivariate and curved/complex families remain an open area, with location-scale families and products of e-geodesics forming the main tractable cases explored thus far. Extensions to higher-order asymptotics, transport-entropic divergences, and generalized transport-exponential structures are identified as promising research directions (Trillos et al., 10 Nov 2025, Nishimori et al., 15 Jun 2025).
8. Comparison with Fisher Information and Exponential Families
Classically, exponential families admit exact attainment of the Cramér-Rao bound for variance, with a wide array of tractable models. In Wasserstein theory, transport families and e-geodesics play a similar role—though the geometric restrictions imposed by optimal transport severely reduce the generality of cases admitting exact sensitivity-efficient estimators. This suggests a sharper dichotomy between variance and sensitivity as robustness criteria, contributing to the broader agenda of statistical inference under geometric and transport-theoretic principles.
Further context is provided by contemporary work (Ay 2024) on Otto connections and the e-geodesics, which deepens the information-geometric perspective for transport metrics and suggests new directions for the structure and analysis of families attaining optimal robustness against additive perturbations.