PW Barycenter in Optimal Transport
- The PW barycenter is the Fréchet mean of a random probability measure in the 2-Wasserstein space, defined as the minimizer of the expected squared Wasserstein distance.
- Under suitable conditions it is characterized, via convex duality, as the push-forward of a reference measure by the average of optimal transport maps, which makes it a natural tool for averaging deformations in imaging and manifold statistics.
- The empirical barycenter is strongly consistent, which supports practical template estimation in applications such as neuroimaging and statistical signal analysis.
A PW Barycenter (“Population Wasserstein barycenter”) is a generalized notion of Fréchet mean for probability measures within the nonlinear metric geometry of the 2-Wasserstein space. It is defined as the minimizer of the expected squared Wasserstein distance to a family of probability measures, and can, under general regularity and compactness assumptions, be characterized as the push-forward of a reference measure by the mean of the optimal transport maps arising from an underlying parametric or random model. This perspective connects duality in optimal transport, convex analysis, and statistical averaging of deformations in stochastic modeling and imaging.
1. Definition and Characterization
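The population Wasserstein barycenter is the minimizer of the Fréchet functional

$$J(\nu) = \int_{\Theta} \tfrac{1}{2}\, W_2^2(\nu, \mu_\theta)\, d\mathbb{P}_\Theta(\theta) = \mathbb{E}\left[\tfrac{1}{2}\, W_2^2(\nu, \mu_\theta)\right],$$

where $\{\mu_\theta\}_{\theta \in \Theta}$ is a parametric family of compactly supported random probability measures, $\mathbb{P}_\Theta$ is the distribution of $\theta$, and $W_2$ is the quadratic Wasserstein distance. The factor $\tfrac{1}{2}$ is a normalization that does not change the minimizer. The barycenter is thus

$$\mu^* = \arg\min_{\nu} J(\nu).$$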
A duality argument from optimal transport theory reveals a deeper structure: if $\mu_0$ is a reference measure and $T_\theta$ is the optimal transport map from $\mu_0$ to $\mu_\theta$, then, under suitable conditions, the barycenter has the form

$$\mu^* = \bar{T} \# \mu_0, \qquad \bar{T}(x) = \mathbb{E}\left[T_\theta(x)\right] = \int_{\Theta} T_\theta(x)\, d\mathbb{P}_\Theta(\theta),$$

where $\#$ denotes push-forward. In other words, the barycenter is constructed by pushing forward the reference measure by the expectation of the optimal transport maps with respect to the parameter distribution.
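In dimension one, the barycenter’s quantile function is given by the average of the input quantile functions:

$$F_{\mu^*}^{-1}(u) = \mathbb{E}\left[F_{\mu_\theta}^{-1}(u)\right] = \int_{\Theta} F_{\mu_\theta}^{-1}(u)\, d\mathbb{P}_\Theta(\theta), \qquad u \in (0, 1).$$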
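The one-dimensional statement can be sketched numerically. The snippet below is a minimal illustration, not the paper's code: it assumes finitely many equally weighted input measures, each represented by samples, and the function name `barycenter_quantiles_1d` and the uniform test distributions are hypothetical choices made for this example.

```python
import numpy as np

def barycenter_quantiles_1d(samples, levels):
    """Average the empirical quantile functions of several 1-D samples."""
    # Row i holds the empirical quantiles of the i-th input measure.
    q = np.stack([np.quantile(s, levels) for s in samples])
    # Equal weights stand in for the expectation over the parameter.
    return q.mean(axis=0)

rng = np.random.default_rng(0)
centers = [-2.0, 0.0, 3.0]       # hypothetical locations
half_widths = [0.5, 1.0, 1.5]    # hypothetical scales
# Each input measure is uniform on [c - h, c + h] (compactly supported).
samples = [rng.uniform(c - h, c + h, size=20_000)
           for c, h in zip(centers, half_widths)]

levels = np.linspace(0.01, 0.99, 99)
bary_q = barycenter_quantiles_1d(samples, levels)
```

Under these assumptions the averaged quantile function is approximately that of a uniform distribution centred at the mean of `centers` with half-width equal to the mean of `half_widths`.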
2. Mathematical Framework
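The barycenter problem is governed by the 2-Wasserstein metric

$$W_2^2(\mu, \nu) = \inf_{\gamma \in \Pi(\mu, \nu)} \int |x - y|^2\, d\gamma(x, y),$$

where $\Pi(\mu, \nu)$ is the set of couplings with fixed marginals $\mu$ and $\nu$. The push-forward operation for a measurable map $T$ is defined via

$$(T \# \mu)(B) = \mu\left(T^{-1}(B)\right) \quad \text{for every Borel set } B.$$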
By Brenier’s theorem, the optimal transport map exists and is unique under absolute continuity and regularity assumptions: for each $\theta$, $T_\theta = \nabla \psi_\theta$ for a convex function $\psi_\theta$.
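For intuition, these objects have simple sample-based analogues in one dimension. The sketch below assumes two empirical measures with the same number of equally weighted atoms, in which case the optimal coupling matches sorted samples; the helper names `w2_1d` and `push_forward` and the affine map are hypothetical and serve only as an illustration.

```python
import numpy as np

def w2_1d(x, y):
    """2-Wasserstein distance between two equal-size 1-D empirical measures."""
    # In 1-D the optimal coupling pairs the sorted atoms of both samples.
    x_sorted, y_sorted = np.sort(x), np.sort(y)
    return np.sqrt(np.mean((x_sorted - y_sorted) ** 2))

def push_forward(T, samples):
    """Samples of T # mu, obtained by applying T to samples of mu."""
    return T(samples)

def T(t):
    # Increasing map, i.e. the derivative of the convex function t**2 + t.
    return 2.0 * t + 1.0

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=5_000)   # samples of mu
y = push_forward(T, x)                  # samples of T # mu
d = w2_1d(x, y)
```

Because `T` is increasing, it is itself the optimal map between the two samples, so `d` coincides with the root mean squared displacement of `T`.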
The characterization of the barycenter as the push-forward by the average OT map $\bar{T} = \mathbb{E}[T_\theta]$ holds under the condition that $T_\theta \circ \bar{T}^{-1}$ is itself the optimal map from $\bar{T} \# \mu_0$ to each $\mu_\theta$ (see Proposition 3.6 and Theorem 3.7 in the paper). In this sense, the averaging of optimal transport maps is central to the characterization.
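As a hedged illustration of averaging optimal transport maps (a worked example added here, not taken from the paper), consider a one-dimensional location-scale family. Assume $\mu_0$ is absolutely continuous with compact support and $\mu_\theta$ is the law of $m_\theta + \sigma_\theta X$ with $X \sim \mu_0$, $\sigma_\theta > 0$, and finite means $\bar{m} = \mathbb{E}[m_\theta]$, $\bar{\sigma} = \mathbb{E}[\sigma_\theta]$; the symbols $m_\theta$ and $\sigma_\theta$ are introduced only for this example. Then

$$\begin{aligned}
T_\theta(x) &= m_\theta + \sigma_\theta x, \\
\bar{T}(x) &= \bar{m} + \bar{\sigma} x, \\
T_\theta \circ \bar{T}^{-1}(y) &= m_\theta + \frac{\sigma_\theta}{\bar{\sigma}}\,(y - \bar{m}).
\end{aligned}$$

Each composed map is increasing, hence optimal in one dimension, and averaging over $\theta$ gives $\mathbb{E}\left[T_\theta \circ \bar{T}^{-1}(y)\right] = \bar{m} + (y - \bar{m}) = y$. In this example the barycenter is therefore the law of $\bar{m} + \bar{\sigma} X$.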
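The dual formulation, based on convex analysis, expresses the barycenter problem (with the Fréchet functional normalized by the factor $\tfrac{1}{2}$) as

$$\sup\left\{ \int_{\Theta} \int_{\Omega} S f_\theta(x)\, d\mu_\theta(x)\, d\mathbb{P}_\Theta(\theta) \;:\; \int_{\Theta} f_\theta(x)\, d\mathbb{P}_\Theta(\theta) = 0 \ \text{for all } x \in \Omega \right\},$$

with $S f(x) = \inf_{y \in \Omega} \left\{ \tfrac{1}{2} |x - y|^2 - f(y) \right\}$, where $\Omega$ denotes the compact set supporting the measures. In one dimension, the barycenter is immediately available as the average of quantile functions.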
3. Extensions to Statistical and Imaging Models
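The paper extends these abstract results to statistical models for signals and images with geometric variability, known as deformable models. For observed random deformations

$$\varphi_\theta : \Omega \to \Omega, \qquad \mu_\theta = \varphi_\theta \# \mu_0,$$

the observed signals or densities are random push-forwards of a template $\mu_0$, with density $q_0$, by diffeomorphisms. In this setting, provided proper integrability and regularity, the barycenter’s density is

$$q^*(x) = \left|\det D\bar{\varphi}^{-1}(x)\right|\, q_0\left(\bar{\varphi}^{-1}(x)\right),$$

with $\bar{\varphi}(x) = \mathbb{E}\left[\varphi_\theta(x)\right]$. Thus, the barycenter captures the mean “template” in a deformation-invariant way, simultaneously accounting for geometric and photometric variations.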
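The density of a template pushed forward by a mean deformation can be sketched in one dimension. The example below assumes affine warps $x \mapsto a x + b$ with $a > 0$, so the mean warp is again affine and the change-of-variables formula is explicit; the triangular template and the names `template_density` and `barycenter_density` are hypothetical choices for illustration only.

```python
import numpy as np

def template_density(x):
    """Hypothetical template q_0: triangular density on [-1, 1]."""
    return np.clip(1.0 - np.abs(x), 0.0, None)

def barycenter_density(x, a, b):
    """Density of the template pushed forward by the mean affine warp."""
    a_bar, b_bar = np.mean(a), np.mean(b)
    # Change of variables for phi_bar(x) = a_bar * x + b_bar:
    # q*(x) = |d/dx phi_bar^{-1}(x)| * q_0(phi_bar^{-1}(x)).
    return template_density((x - b_bar) / a_bar) / a_bar

rng = np.random.default_rng(2)
a = rng.uniform(0.5, 1.5, size=1_000)    # random positive scalings
b = rng.uniform(-1.0, 1.0, size=1_000)   # random shifts

grid = np.linspace(-4.0, 4.0, 4_001)
q_star = barycenter_density(grid, a, b)
# Riemann sum as a sanity check that q* is still a probability density.
mass = np.sum(q_star) * (grid[1] - grid[0])
```

The sanity check on `mass` reflects that a push-forward by a diffeomorphism preserves total mass.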
This explicit formula provides a practical and statistically meaningful solution to template estimation under complex geometric warping, especially in contexts such as neuroimaging, atlas construction, and shape or texture summarization.
4. Estimation and Consistency: Empirical Barycenter
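Given i.i.d. random measures $\mu_{\theta_1}, \dots, \mu_{\theta_n}$, the empirical barycenter is

$$\bar{\mu}_n = \arg\min_{\nu} \frac{1}{n} \sum_{i=1}^{n} \tfrac{1}{2}\, W_2^2(\nu, \mu_{\theta_i}).$$

Under compact support, existence and uniqueness are ensured. The paper establishes strong statistical consistency: as $n \to \infty$, the empirical barycenter $\bar{\mu}_n$ converges in $W_2$ almost surely to the population barycenter $\mu^*$. The proof adapts the strong law of large numbers to the Wasserstein setting.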
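Consistency can be observed in a toy simulation. The sketch below assumes a one-dimensional location-scale family built on a base variable with zero mean and unit variance, for which the empirical and population barycenters are both members of the family and their squared $W_2$ distance reduces to the squared difference of locations plus the squared difference of scales. The parameter distributions and the function name `w2_to_population` are hypothetical.

```python
import numpy as np

def w2_to_population(n, rng):
    """W2 distance between empirical and population barycenters (toy model)."""
    m = rng.uniform(-1.0, 1.0, size=n)   # random locations, E[m] = 0
    s = rng.uniform(0.5, 1.5, size=n)    # random scales,    E[s] = 1
    # Empirical barycenter has location mean(m) and scale mean(s);
    # the population barycenter has location 0 and scale 1.
    return np.hypot(m.mean() - 0.0, s.mean() - 1.0)

rng = np.random.default_rng(3)
errors = {n: w2_to_population(n, rng) for n in (10, 1_000, 100_000)}
```

In this toy model the error is driven by ordinary sample means, so it shrinks at the usual $n^{-1/2}$ rate as the sample size grows.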
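For practical computation, when the OT maps $T_{\theta_i}$ from $\mu_0$ to each $\mu_{\theta_i}$ can be computed explicitly, one can use the empirical mean map

$$\bar{T}_n(x) = \frac{1}{n} \sum_{i=1}^{n} T_{\theta_i}(x)$$

and approximate the barycenter by $\bar{T}_n \# \mu_0$.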
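A minimal sketch of this computation, assuming the transport maps are available as Python callables and the reference measure is represented by samples, might look as follows. The helper `mean_map`, the affine maps, and the uniform reference measure are hypothetical and chosen only to keep the example explicit.

```python
import numpy as np

def mean_map(maps):
    """Return the pointwise average of a list of transport maps."""
    def T_bar(x):
        # Stack T_i(x) along a new first axis and average over i.
        return np.mean([T(x) for T in maps], axis=0)
    return T_bar

# Hypothetical explicit maps T_i(x) = m_i + s_i * x with s_i > 0.
params = [(-2.0, 0.5), (0.0, 1.0), (3.0, 1.5)]
maps = [lambda x, m=m, s=s: m + s * x for m, s in params]

rng = np.random.default_rng(4)
ref = rng.uniform(-1.0, 1.0, size=10_000)   # samples of the reference measure
bary = mean_map(maps)(ref)                  # samples of the mean map's push-forward
```

Here the averaged map is `x -> 1/3 + x`, so `bary` is the reference sample shifted by the mean location. With maps that are not explicit, each `T` would have to be estimated first, which this sketch does not cover.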
5. Relationship to Prior Work and Broader Interpretation
This framework generalizes the concept of the empirical barycenter given by Agueh and Carlier (2011) to full population models, extending from finitely many fixed measures to general families of random measures. It also relaxes assumptions (e.g., concerning the admissibility of the class of maps to be averaged), and shows that, provided only that the average of OT maps is compatible with the optimal transport structure, the population barycenter is characterized as push-forward by the mean OT map.
Earlier approaches required stronger admissibility conditions or were only justified in finite, discrete, or one-dimensional settings. This extension is significant both for the theory of optimal transport and for applications in statistics where geometric averaging and averaging of deformations are central.
6. Practical Implications and Summary of Core Results
The characterization of the PW barycenter enables practical algorithms for template estimation in imaging, manifold statistics, and registration tasks, especially when statistical models involve random deformations. The central results may be summarized as follows:
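Aspect | Mathematical Statement | Interpretation |
---|---|---|
Population barycenter | $\mu^* = \arg\min_{\nu} \int_{\Theta} \tfrac{1}{2} W_2^2(\nu, \mu_\theta)\, d\mathbb{P}_\Theta(\theta)$ | Wasserstein Fréchet mean of the distribution |
Averaged OT maps | $\mu^* = \bar{T} \# \mu_0$, $\bar{T} = \mathbb{E}[T_\theta]$ | Barycenter as push-forward by mean transport map |
Empirical barycenter | $\bar{\mu}_n = \arg\min_{\nu} \frac{1}{n} \sum_{i=1}^{n} \tfrac{1}{2} W_2^2(\nu, \mu_{\theta_i})$ | Sample estimate of barycenter |
Strong consistency | $W_2(\bar{\mu}_n, \mu^*) \to 0$ a.s. as $n \to \infty$ | Empirical barycenter converges to population barycenter |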
Conclusion
For broad classes of random probability measures, including models of random geometric deformations prevalent in statistical image and signal analysis, the PW barycenter can be characterized rigorously, and computed in practice, as the push-forward of a reference probability measure by the mean of the optimal transport maps. This yields a principled, geometry-aware averaging scheme for probability distributions, with consistency guarantees and explicit strategies for empirical estimation, and it extends the notion of averaging to the non-Euclidean Wasserstein space.