
PW Barycenter in Optimal Transport

Updated 2 July 2025
  • PW Barycenter is the statistical mean of probability measures in the 2-Wasserstein space, defined via the minimization of expected squared transport distances.
  • It employs the averaging of optimal transport maps and convex duality to characterize deformations in imaging and manifold statistics.
  • The empirical formulation ensures strong consistency, enabling practical template estimation in applications like neuroimaging and statistical signal analysis.

A PW Barycenter (“Population Wasserstein barycenter”) is a generalized notion of Fréchet mean for probability measures within the nonlinear metric geometry of the 2-Wasserstein space. It is defined as the minimizer of the expected squared Wasserstein distance to a family of probability measures, and can, under general regularity and compactness assumptions, be characterized as the push-forward of a reference measure by the mean of the optimal transport maps arising from an underlying parametric or random model. This perspective connects duality in optimal transport, convex analysis, and statistical averaging of deformations in stochastic modeling and imaging.

1. Definition and Characterization

The population Wasserstein barycenter is the minimizer $\mu^*$ of the Fréchet functional

$$J(\nu) = \int_{\Theta} \frac{1}{2} d_{W_2}^2(\nu, \mu_\theta) \, g(\theta)\, d\theta,$$

where $\{\mu_\theta\}_{\theta\in\Theta}$ is a parametric family of compactly supported random probability measures, $g(\theta)$ is the density of the distribution of $\theta$, and $d_{W_2}$ is the quadratic Wasserstein distance. The barycenter is thus

$$\mu^* = \operatorname{argmin}_\nu J(\nu).$$

A duality argument from optimal transport theory reveals a deeper structure: if $\mu_0$ is a reference measure and $T_\theta$ is the optimal transport map from $\mu_0$ to $\mu_\theta$, then, under suitable conditions, the barycenter has the form

$$\mu^* = \bar{T} \# \mu_0, \quad \text{where} \quad \bar{T}(x) = \int_\Theta T_\theta(x)\, g(\theta)\, d\theta,$$

and $\#$ denotes push-forward. In other words, the barycenter is constructed by pushing forward the reference measure by the expectation of the optimal transport maps with respect to the parameter distribution.

In dimension one, the barycenter’s quantile function is given by the average of the input quantile functions:

$$F_{\mu^*}^{-1}(y) = \int_\Theta F_{\mu_\theta}^{-1}(y)\, g(\theta)\, d\theta.$$
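The one-dimensional quantile formula can be sketched numerically as follows. This is a minimal illustration, not the paper's implementation: the integral over $\Theta$ is replaced by a finite weighted sum, and the three Gaussian measures, the weights, and the helper name `barycenter_quantiles` are all hypothetical choices made for the example.

```python
from statistics import NormalDist

import numpy as np


def barycenter_quantiles(quantile_fns, weights, levels):
    """Weighted average of quantile functions, evaluated at the given levels."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    # Row j holds the quantile function of measure j on the grid of levels.
    table = np.array([[q(u) for u in levels] for q in quantile_fns])
    return weights @ table


# Hypothetical inputs: three Gaussian measures N(m, s^2) and their weights.
params = [(-1.0, 0.5), (0.0, 1.0), (2.0, 1.5)]
weights = [0.2, 0.3, 0.5]
levels = np.linspace(0.05, 0.95, 19)

quantile_fns = [NormalDist(m, s).inv_cdf for m, s in params]
bary_q = barycenter_quantiles(quantile_fns, weights, levels)

# For Gaussians the averaged quantile function is again Gaussian, with
# mean sum(w * m) and standard deviation sum(w * s).
mean_m = sum(w * m for w, (m, s) in zip(weights, params))
mean_s = sum(w * s for w, (m, s) in zip(weights, params))
closed_form = np.array([NormalDist(mean_m, mean_s).inv_cdf(u) for u in levels])
```

In this Gaussian case `bary_q` should agree with `closed_form` up to floating-point error, which gives a quick sanity check on the averaging step.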

2. Mathematical Framework

The barycenter problem is governed by the 2-Wasserstein metric

$$d_{W_2}^2(\mu, \nu) = \inf_{\gamma \in \Pi(\mu, \nu)} \int |x - y|^2\, d\gamma(x, y),$$

where $\Pi(\mu, \nu)$ is the set of couplings with fixed marginals. The push-forward operation for a measurable map $T$ is defined via

$$\int f(x)\, d(T\#\mu)(x) = \int f(T(x))\, d\mu(x).$$
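For intuition about the 2-Wasserstein metric and the push-forward defined above, here is a small sketch restricted to one dimension. It assumes two empirical measures with the same number of equally weighted atoms; in that case the infimum over couplings is attained by matching sorted atoms. The function name `w2_squared_1d` and the sample values are illustrative only.

```python
import numpy as np


def w2_squared_1d(x, y):
    """Squared 2-Wasserstein distance between two 1D empirical measures.

    Assumes both measures have the same number of equally weighted atoms,
    so the optimal coupling pairs the sorted atoms in order.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("both samples must have the same number of atoms")
    return float(np.mean((x - y) ** 2))


atoms = [0.0, 1.0, 2.0]
other = [3.0, 5.0, 4.0]
cost = w2_squared_1d(atoms, other)  # sorted pairs differ by 3, so cost is 9.0

# Push-forward of the empirical measure by T(x) = x + 2: move each atom.
shifted = [v + 2.0 for v in atoms]
shift_cost = w2_squared_1d(atoms, shifted)  # a translation by 2 costs 4.0
```

Note that this follows the convention of the displayed formula, without the factor $\frac{1}{2}$ that appears in the Fréchet functional.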

By Brenier’s theorem, the optimal transport map exists and is unique under absolute continuity and regularity assumptions: for each $\theta$, $\mu_\theta = T_\theta\#\mu_0$.

The characterization of the barycenter as the push-forward by the average OT map holds provided that $T_\theta \circ \bar{T}^{-1}$ is itself the optimal map from $\mu^*$ to each $\mu_\theta$ (see Proposition 3.6 and Theorem 3.7 in the paper). In this sense, the averaging of optimal transport maps is central to the characterization.
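As a worked illustration of this compatibility condition (an assumed example, not taken from the paper), take $\mu_0 = N(0,1)$ and $\mu_\theta = N(m_\theta, s_\theta^2)$ on the real line with $s_\theta > 0$, and write $\bar{m} = \int_\Theta m_\theta\, g(\theta)\, d\theta$ and $\bar{s} = \int_\Theta s_\theta\, g(\theta)\, d\theta$. Then

$$T_\theta(x) = m_\theta + s_\theta x, \qquad \bar{T}(x) = \bar{m} + \bar{s}\, x, \qquad T_\theta \circ \bar{T}^{-1}(y) = m_\theta + \frac{s_\theta}{\bar{s}}\,(y - \bar{m}).$$

Since $s_\theta / \bar{s} > 0$, the composed map is increasing and therefore optimal in one dimension, so the condition is satisfied and $\mu^* = \bar{T}\#\mu_0 = N(\bar{m}, \bar{s}^2)$.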

The dual formulation, based on convex analysis, expresses the barycenter problem as

$$J_P := \inf_\nu \int_\Theta \frac{1}{2} d_{W_2}^2(\nu, \mu_\theta)\, g(\theta)\, d\theta = \sup \left\{ \int_\Theta \int_\Omega S_{g(\theta)} f_\theta(x)\, d\mu_\theta(x)\, d\theta : \int_\Theta f_\theta(x)\, d\theta = 0 \ \forall x \right\}$$

with $S_{g(\theta)}f(x) = \inf_y \left\{ \frac{g(\theta)}{2}|x-y|^2 - f(y) \right\}$. In one dimension, the barycenter is immediately available as the average of quantile functions.

3. Extensions to Statistical and Imaging Models

The paper extends these abstract results to statistical models for signals and images with geometric variability, known as deformable models. For observed random deformations

$$X_i(x) = h(\varphi_i^{-1}(x)), \qquad q_i(x) = |\det D\varphi_i^{-1}(x)|\, q_0(\varphi_i^{-1}(x)),$$

the observed signals or densities are random push-forwards of a template by diffeomorphisms. In this setting, provided proper integrability and regularity, the barycenter’s density is

$$\overline{q}(x) = |\det D\overline{\varphi}^{-1}(x)|\, q_0(\overline{\varphi}^{-1}(x)),$$

with $\overline{\varphi}(x) = \mathbb{E}_\theta \varphi_\theta(x)$. Thus, the barycenter captures the mean “template” in a deformation-invariant way, simultaneously accounting for geometric and photometric variations.
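The density formula can be checked on a toy case. The sketch below assumes one-dimensional affine deformations $\varphi_\theta(x) = a_\theta x + b_\theta$ with $a_\theta > 0$ (increasing, hence optimal transport maps in 1D) and a standard normal template $q_0$; the sampling distributions for $a_\theta$ and $b_\theta$ and all function names are hypothetical.

```python
import numpy as np


def template_density(x):
    """Assumed template q_0: the standard normal density."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)


def barycenter_density(x, q0, a_mean, b_mean):
    """Density of the push-forward of q0 by phi_bar(x) = a_mean * x + b_mean."""
    inverse = (x - b_mean) / a_mean  # phi_bar^{-1}(x)
    jacobian = 1.0 / a_mean          # |det D phi_bar^{-1}(x)| when a_mean > 0
    return jacobian * q0(inverse)


# Hypothetical random affine deformations; their sample mean plays the
# role of the averaged deformation phi_bar.
rng = np.random.default_rng(0)
a = rng.uniform(0.5, 1.5, size=200)
b = rng.normal(0.0, 1.0, size=200)
a_mean, b_mean = a.mean(), b.mean()

grid = np.linspace(-10.0, 10.0, 4001)
q_bar = barycenter_density(grid, template_density, a_mean, b_mean)

# Riemann-sum check that q_bar is (numerically) a probability density.
mass = float(np.sum(q_bar) * (grid[1] - grid[0]))
```

With these choices `q_bar` is the density of $N(\bar{b}, \bar{a}^2)$, and `mass` should be close to 1.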

This explicit formula provides a practical and statistically meaningful solution to template estimation under complex geometric warping, especially in contexts such as neuroimaging, atlas construction, and shape or texture summarization.

4. Estimation and Consistency: Empirical Barycenter

Given $n$ i.i.d. random measures $\mu_{\theta_1},\ldots,\mu_{\theta_n}$, the empirical barycenter is

$$\overline{\mu}_n = \operatorname{argmin}_\nu \frac{1}{n} \sum_{j=1}^n \frac{1}{2} d_{W_2}^2(\nu, \mu_{\theta_j}).$$

Under compact support, existence and uniqueness are ensured. The paper establishes strong statistical consistency: as $n\to\infty$, the empirical barycenter converges in $W_2$ almost surely to the population barycenter. The proof adapts the strong law of large numbers to the Wasserstein setting.
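A small simulation can illustrate the consistency statement in a case where everything is available in closed form. The sketch assumes one-dimensional Gaussian measures $N(m_\theta, s_\theta^2)$ with hypothetical random parameters; Gaussians are not compactly supported, so this sits outside the paper's assumptions and serves only as intuition. In this family the empirical barycenter is the Gaussian with averaged mean and averaged standard deviation, and the $W_2$ distance between two 1D Gaussians is $\sqrt{(m_1 - m_2)^2 + (s_1 - s_2)^2}$.

```python
import numpy as np


def w2_gaussian_1d(m1, s1, m2, s2):
    """W_2 distance between the 1D Gaussians N(m1, s1^2) and N(m2, s2^2)."""
    return float(np.hypot(m1 - m2, s1 - s2))


rng = np.random.default_rng(0)

# Hypothetical parameter model: m ~ N(1, 2^2) and s ~ Uniform(0.5, 1.5),
# so the population barycenter is N(E[m], E[s]^2) = N(1, 1).
pop_mean, pop_std = 1.0, 1.0

errors = {}
for n in (10, 1_000, 100_000):
    m = rng.normal(1.0, 2.0, size=n)
    s = rng.uniform(0.5, 1.5, size=n)
    # Empirical barycenter of the n Gaussians: N(mean(m), mean(s)^2).
    errors[n] = w2_gaussian_1d(m.mean(), s.mean(), pop_mean, pop_std)
```

The entries of `errors` are the $W_2$ distances between empirical and population barycenters; they shrink roughly like $n^{-1/2}$ on average, although a single run need not decrease monotonically.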

For practical computation, when OT maps $T_{\theta_j}$ from $\mu_0$ to each $\mu_{\theta_j}$ can be computed explicitly, one can use the empirical mean map

$$\overline{T}_n(x) = \frac{1}{n} \sum_{j=1}^n T_{\theta_j}(x)$$

and approximate the barycenter by $\overline{T}_n\#\mu_0$.
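The mean-map approximation can be sketched with samples standing in for $\mu_0$. The example assumes maps of the form $T_j(x) = m_j + s_j \odot x$ with componentwise positive scalings $s_j$, which are gradients of convex functions and for which the compatibility condition on $T_\theta \circ \bar{T}^{-1}$ holds; the reference measure $N(0, I)$, the random parameters, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_maps, dim = 50, 2

# Hypothetical explicit OT maps T_j(x) = shift_j + scale_j * x, with
# componentwise positive scales.
shifts = rng.normal(0.0, 1.0, size=(n_maps, dim))
scales = rng.uniform(0.5, 2.0, size=(n_maps, dim))


def make_map(shift, scale):
    """Return the map x -> shift + scale * x (componentwise scaling)."""
    return lambda x: shift + scale * x


maps = [make_map(m, s) for m, s in zip(shifts, scales)]


def mean_map(maps, x):
    """Empirical mean map: average of T_j(x) over all maps."""
    return np.mean([T(x) for T in maps], axis=0)


# Samples from the assumed reference measure mu_0 = N(0, I).
ref_samples = rng.normal(size=(5000, dim))

# Pushing the reference samples through the mean map gives samples from
# the approximate barycenter.
bary_samples = mean_map(maps, ref_samples)
```

Because these maps are affine, `bary_samples` coincides with `shifts.mean(0) + scales.mean(0) * ref_samples`. For general maps, averaging the evaluated maps pointwise as in `mean_map` is the step that carries over.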

5. Relationship to Prior Work and Broader Interpretation

This framework generalizes the concept of the empirical barycenter given by Agueh and Carlier (2011) to full population models, extending from finitely many fixed measures to general families of random measures. It also relaxes assumptions (e.g., concerning the admissibility of the class of maps to be averaged), and shows that, provided only that the average of OT maps is compatible with the optimal transport structure, the population barycenter is characterized as push-forward by the mean OT map.

Earlier approaches required stronger admissibility conditions or were only justified in finite, discrete, or one-dimensional settings. This extension is significant both for the theory of optimal transport and for applications in statistics where geometric averaging and averaging of deformations are central.

6. Practical Implications and Summary of Core Results

The characterization of the PW barycenter enables practical algorithms for template estimation in imaging, manifold statistics, and registration tasks, especially when statistical models involve random deformations. The central results may be summarized as follows:

| Aspect | Mathematical Statement | Interpretation |
| --- | --- | --- |
| Population barycenter | $\mu^* = \operatorname{argmin}_\nu \int \frac{1}{2} d_{W_2}^2(\nu, \mu_\theta)\, dP(\theta)$ | Wasserstein Fréchet mean of the distribution |
| Averaged OT maps | $\mu^* = \bar{T}\#\mu_0$, $\bar{T}(x)=\mathbb{E}T_\theta(x)$ | Barycenter as push-forward by mean transport map |
| Empirical barycenter | $\overline{\mu}_n = \operatorname{argmin}_\nu \frac{1}{n} \sum_{j=1}^n \frac{1}{2} d_{W_2}^2(\nu, \mu_{\theta_j})$ | Sample estimate of barycenter |
| Strong consistency | $d_{W_2}(\overline{\mu}_n, \mu^*) \to 0$ a.s. as $n \to \infty$ | Empirical barycenter converges to population barycenter |

Conclusion

For broad classes of random probability measures, including the models of random geometric deformations common in statistical image and signal analysis, the PW barycenter admits a rigorous and practically computable characterization as the push-forward of a reference probability measure by the mean of the optimal transport maps. This yields a principled, geometry-aware averaging scheme for probability distributions, with consistency guarantees and explicit computational strategies for empirical estimation, and it extends the notion of averaging to a non-Euclidean space.