Barycentric Quadratic Cost in Wasserstein Barycenters
- The barycentric quadratic cost is the weighted sum of squared Wasserstein-2 distances whose minimizer defines an optimal average distribution, the barycenter.
- Its equivalent formulations, including two-marginal and multi-marginal approaches, ensure uniqueness and stability through convexity and regularity conditions.
- Efficient algorithms employing fixed-point iterations and quadratic programming deliver robust computation for high-dimensional statistical inference and data science.
The barycentric quadratic cost arises in the definition, computation, and analysis of Wasserstein-2 barycenters, which are averages of several probability measures under the optimal transport metric with quadratic cost. The concept is foundational in statistical inference, geometric measure theory, and computational optimal transport, and in data science applications that depend on meaningful interpolation or averaging of distributions. It provides both the theoretical underpinning and the structure exploited by barycenter algorithms such as fixed-point iterations and quadratic programs. Its study draws on convex geometry, functional analysis, Riemannian geometry on measure spaces, and numerical computation.
1. Definition and Variational Principles
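Given probability measures $\mu_1,\dots,\mu_N$ on $\mathbb{R}^d$ with finite second moments and positive weights $\lambda_1,\dots,\lambda_N$ summing to 1, the quadratic barycentric problem is to find $\nu \in \mathcal{P}_2(\mathbb{R}^d)$ minimizing the weighted sum of squared 2-Wasserstein distances:
$$V(\nu) = \sum_{i=1}^N \lambda_i\, W_2^2(\nu,\mu_i),$$
where $W_2^2(\nu,\mu_i) = \min_{\pi\in\Pi(\nu,\mu_i)} \int \|x-y\|^2 \, d\pi(x,y)$, with $\Pi(\nu,\mu_i)$ the set of couplings with marginals $\nu$ and $\mu_i$.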
This variational principle directly encodes the barycentric quadratic cost as the functional whose minimizer is the Wasserstein-2 barycenter (Tanguy et al., 2024).
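As a concrete illustration of this functional, here is a minimal sketch for the one-dimensional case, assuming uniform empirical measures with equally many atoms. In that setting sorting gives the optimal coupling, and the barycenter's atoms are the weighted average of the sorted samples. The function names, sample distributions, and weights are illustrative choices, not taken from the cited works.

```python
import numpy as np

def w2_squared_1d(x, y):
    """Squared W2 distance between two uniform empirical measures on the real
    line with equally many atoms; sorting gives the optimal (monotone) coupling."""
    return float(np.mean((np.sort(x) - np.sort(y)) ** 2))

def barycentric_cost(nu, samples, weights):
    """V(nu) = sum_i lambda_i * W2^2(nu, mu_i) for empirical measures."""
    return sum(w * w2_squared_1d(nu, s) for w, s in zip(weights, samples))

def barycenter_1d(samples, weights):
    """1D barycenter atoms: weighted average of the sorted samples,
    i.e. of the empirical quantile functions."""
    return sum(w * np.sort(s) for w, s in zip(weights, samples))

# Illustrative inputs (assumed for this sketch).
rng = np.random.default_rng(0)
samples = [rng.normal(-2.0, 1.0, 500),
           rng.normal(3.0, 0.5, 500),
           rng.exponential(1.0, 500)]
weights = [0.5, 0.3, 0.2]

bary = barycenter_1d(samples, weights)
cost_at_bary = barycentric_cost(bary, samples, weights)
cost_shifted = barycentric_cost(bary + 0.1, samples, weights)
print(cost_at_bary, cost_shifted)
```

Shifting the barycenter by a constant raises the cost by exactly the squared shift, because the cross term vanishes when the weights sum to 1. This is a small instance of the strict convexity that the quadratic cost provides.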
2. Equivalent Formulations and Geometric Structure
The barycentric quadratic cost admits several equivalent formulations central to existence, uniqueness, and computational analysis.
- Two-marginal formulation (C2M): The barycenter minimizes $\sum_{i=1}^N \lambda_i W_2^2(\nu,\mu_i)$ over all $\nu \in \mathcal{P}_2(\mathbb{R}^d)$.
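- Multi-marginal formulation (MM): Introduces a coupling $\gamma$ on $(\mathbb{R}^d)^N$ with marginals $\mu_1,\dots,\mu_N$ and minimizes
$$\int_{(\mathbb{R}^d)^N} \sum_{i=1}^N \lambda_i \,\|x_i - B(x_1,\dots,x_N)\|^2 \, d\gamma(x_1,\dots,x_N),$$
where $B(x_1,\dots,x_N) = \sum_{i=1}^N \lambda_i x_i$ is the weighted Euclidean barycenter. The pushforward $B_\#\gamma$ of an optimal $\gamma$ yields the barycenter (Brizzi et al., 2024).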
Under standard regularity (e.g., one $\mu_i$ absolutely continuous), the barycenter is unique, the optimal multi-marginal plan is Monge (a graph over the first marginal), and the quadratic cost structure guarantees strict convexity with respect to both measure and coupling (Brizzi et al., 2024).
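One way to see why the weighted Euclidean barycenter enters the multi-marginal cost is the following pointwise identity, given here as a short worked step. The notation $B(x) = \sum_i \lambda_i x_i$ for fixed points $x_1,\dots,x_N \in \mathbb{R}^d$ is chosen for this sketch.

```latex
\sum_{i=1}^N \lambda_i \,\|x_i - z\|^2
  \;=\; \sum_{i=1}^N \lambda_i \,\|x_i - B(x)\|^2 \;+\; \|z - B(x)\|^2
  \qquad \text{for all } z \in \mathbb{R}^d .
```

The cross term $2\langle \sum_i \lambda_i (x_i - B(x)),\, B(x) - z\rangle$ vanishes because the weights sum to 1, so $z = B(x)$ is the unique minimizer of the left-hand side. Integrating this pointwise-minimal cost against a coupling of $\mu_1,\dots,\mu_N$ gives the multi-marginal objective.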
3. Algorithmic Computation: Fixed-Point and Quadratic Programs
The barycentric quadratic cost structure enables efficient computational schemes:
- Fixed-point iteration: Following the Fréchet mean paradigm, the iterates
$$\nu_{t+1} = \Big(\sum_{i=1}^N \lambda_i T_i^{t}\Big)_{\#} \nu_t,$$
where $T_i^{t}$ are optimal transport maps from $\nu_t$ to $\mu_i$, converge under standard hypotheses to the unique barycenter. The energy $V(\nu_t) = \sum_i \lambda_i W_2^2(\nu_t,\mu_i)$ decreases at every step, and weak limit points satisfy the fixed-point property. Uniqueness ensures full sequence convergence (Tanguy et al., 2024).
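- Quadratic program for barycentric coordinates: In the barycentric coding model, determining the weights $\lambda$ so that a measure $\mu_0$ is a barycenter for measures $\mu_1,\dots,\mu_N$ reduces to minimizing
$$\lambda^\top A\, \lambda \quad \text{over the probability simplex},$$
where $A_{ij} = \int \langle T_i(x) - x,\; T_j(x) - x\rangle \, d\mu_0(x)$, and $T_i$ are optimal maps from $\mu_0$ to $\mu_i$. Under appropriate conditions, the unique minimizer yields the true barycentric coordinates (Werenski et al., 2022).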
- For Gaussian inputs, the barycentric quadratic cost program reduces to a closed-form QP in terms of means and covariances (Werenski et al., 2022).
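The fixed-point scheme above can be sketched for centered Gaussian inputs, where optimal maps are linear and each iterate is determined by its covariance matrix. This is a minimal illustration assuming zero-mean Gaussians and the standard closed forms for Gaussian optimal maps and $W_2$. The helper names and the example covariances are hypothetical, and the code is not the implementation from the cited papers.

```python
import numpy as np

def sqrtm_psd(a):
    """Symmetric square root of a positive semidefinite matrix."""
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T

def gaussian_ot_map(s, sigma):
    """Matrix T of the optimal map x -> T x from N(0, s) to N(0, sigma)."""
    r = sqrtm_psd(s)
    r_inv = np.linalg.inv(r)
    return r_inv @ sqrtm_psd(r @ sigma @ r) @ r_inv

def w2_squared_gaussian(s, sigma):
    """Squared W2 distance between N(0, s) and N(0, sigma)."""
    r = sqrtm_psd(s)
    cross = np.trace(sqrtm_psd(r @ sigma @ r))
    return float(np.trace(s) + np.trace(sigma) - 2.0 * cross)

def energy(s, sigmas, weights):
    """Barycentric quadratic cost of N(0, s) against the inputs."""
    return sum(w * w2_squared_gaussian(s, c) for w, c in zip(weights, sigmas))

def fixed_point_barycenter(sigmas, weights, n_iter=100):
    """Iterate nu <- (sum_i lambda_i T_i)# nu at the level of covariances."""
    s = np.eye(sigmas[0].shape[0])
    energies = [energy(s, sigmas, weights)]
    for _ in range(n_iter):
        m = sum(w * gaussian_ot_map(s, c) for w, c in zip(weights, sigmas))
        s = m @ s @ m.T            # covariance of the pushforward by x -> m x
        s = 0.5 * (s + s.T)        # remove rounding asymmetry
        energies.append(energy(s, sigmas, weights))
    return s, energies

# Illustrative inputs (assumed for this sketch).
sigmas = [np.array([[2.0, 0.5], [0.5, 1.0]]),
          np.array([[1.0, -0.3], [-0.3, 3.0]]),
          np.array([[0.5, 0.0], [0.0, 0.5]])]
weights = [0.2, 0.5, 0.3]
s_bar, energies = fixed_point_barycenter(sigmas, weights)
print(s_bar)
```

The recorded `energies` list makes the descent property easy to inspect. At convergence the covariance satisfies the fixed-point equation $S = \sum_i \lambda_i (S^{1/2}\Sigma_i S^{1/2})^{1/2}$.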
4. Statistical, Geometric, and Stability Properties
The barycentric quadratic cost induces fundamental geometric structure:
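- Riemannian geometry: The tangent space at $\mu_0$ in $\mathcal{P}_2(\mathbb{R}^d)$ is represented by optimal displacement fields $T_i - \mathrm{Id}$; the Gram matrix $A$ encodes inner products $A_{ij} = \langle T_i - \mathrm{Id},\, T_j - \mathrm{Id}\rangle_{L^2(\mu_0)}$. The cost function $\lambda^\top A \lambda = \big\|\sum_i \lambda_i (T_i - \mathrm{Id})\big\|^2_{L^2(\mu_0)}$ measures the squared norm of the barycentric displacement (Werenski et al., 2022).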
- Injectivity and stability: The barycenter map $B(x_1,\dots,x_N) = \sum_i \lambda_i x_i$ is globally invertible (Lipschitz inverse) and the quadratic cost enforces a strong injectivity estimate: $\|B(x) - B(y)\| \ge c\,\|x - y\|$, with a constant $c > 0$, on the support of the optimal plan. This produces absolute continuity and stability of barycenters (Brizzi et al., 2024).
Statistical guarantees are available via sample-based estimators. The quadratic cost enables convergence rates for barycentric coordinate estimation that are dimension- and smoothness-dependent, and the quadratic program is robust under entropic regularization, regularity, and point cloud approximations (Werenski et al., 2022).
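A sample-based version of the quadratic program for barycentric coordinates can be sketched in one dimension, where the optimal map between equal-size uniform empirical measures matches sorted atoms. The sketch builds the Gram matrix of displacements and minimizes $\lambda^\top A \lambda$ over the simplex by projected gradient descent. The solver choice, function names, and test distributions are assumptions made for illustration and are not the estimator analyzed in the cited work.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    ks = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def gram_matrix_1d(x0, refs):
    """A_ij = mean over atoms of (T_i - Id)(T_j - Id), where T_i matches
    the sorted atoms of x0 to the sorted atoms of refs[i] (1D optimal map)."""
    base = np.sort(x0)
    disp = np.stack([np.sort(r) - base for r in refs])  # shape (p, n)
    return disp @ disp.T / base.size

def barycentric_coordinates(a, n_iter=20000):
    """Minimize lam^T A lam over the simplex by projected gradient descent."""
    lam = np.full(a.shape[0], 1.0 / a.shape[0])
    # The gradient is 2 A lam; with step 1 / (2 * lambda_max) the update is
    # lam - (A lam) / lambda_max.
    inv_lmax = 1.0 / max(np.linalg.eigvalsh(a)[-1], 1e-12)
    for _ in range(n_iter):
        lam = project_simplex(lam - inv_lmax * (a @ lam))
    return lam

# Illustrative setup (assumed): build x0 as an exact 1D barycenter of the
# reference samples with known weights, then try to recover those weights.
rng = np.random.default_rng(1)
refs = [rng.normal(-2.0, 1.0, 400),
        rng.normal(3.0, 0.5, 400),
        rng.exponential(1.0, 400)]
true_lam = np.array([0.2, 0.5, 0.3])
x0 = sum(w * np.sort(r) for w, r in zip(true_lam, refs))

gram = gram_matrix_1d(x0, refs)
lam_hat = barycentric_coordinates(gram)
print(lam_hat)
```

Because `x0` is an exact barycenter here, the weighted displacements cancel and the quadratic form attains zero at the true weights. With noisy or misspecified inputs the minimum would generally be positive.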
5. Extensions: Regularization and General Costs
The barycentric quadratic cost paradigm serves as the baseline for more general transport cost models:
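- Entropic regularization: Replaces the cost functional $W_2^2$ with its $\varepsilon$-regularized counterpart:
$$W_{2,\varepsilon}^2(\mu,\nu) = \min_{\pi\in\Pi(\mu,\nu)} \int \|x-y\|^2 \, d\pi(x,y) + \varepsilon\,\mathrm{KL}(\pi \,|\, \mu\otimes\nu),$$
with convergence and decrease properties preserved due to strict convexity of KL (Tanguy et al., 2024).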
- Generic cost functions: The fixed-point method, multi-coupling construction, and corresponding decrease-compactness arguments generalize to arbitrary continuous costs $c_i$ under suitable “ground barycentre” hypotheses (Tanguy et al., 2024).
- Entropic and energy-based algorithmic methods: Modern approaches for approximating quadratic-cost barycenters in high-dimensional or structured domains utilize dual formulations, neural parameterizations (potential functions), stochastic optimization, and MCMC sampling, leveraging the smoothness and geometry provided by the quadratic cost (Kolesov et al., 2023).
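To make the entropic variant concrete, the following is a minimal Sinkhorn sketch for the regularized quadratic cost between two discrete measures, assuming KL regularization relative to the product of the marginals. It is a plain NumPy illustration with hypothetical names and inputs. It is not stabilized for very small $\varepsilon$ and is not the method of any cited paper.

```python
import numpy as np

def sinkhorn_quadratic(x, y, a, b, eps, n_iter=500):
    """Entropic OT with squared Euclidean cost between sum_i a_i delta_{x_i}
    and sum_j b_j delta_{y_j}. Returns the plan and the regularized cost
    <plan, C> + eps * KL(plan | a (x) b)."""
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-cost / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (kernel @ v)       # match the row marginals
        v = b / (kernel.T @ u)     # match the column marginals
    plan = u[:, None] * kernel * v[None, :]
    transport = float(np.sum(plan * cost))
    kl = float(np.sum(plan * np.log(plan / np.outer(a, b))))
    return plan, transport + eps * kl

# Illustrative inputs (assumed for this sketch).
rng = np.random.default_rng(2)
x = rng.uniform(0.0, 1.0, size=(40, 1))
y = rng.uniform(0.5, 1.5, size=(60, 1))
a = np.full(40, 1.0 / 40)
b = np.full(60, 1.0 / 60)

plan, reg_cost = sinkhorn_quadratic(x, y, a, b, eps=0.5)
print(reg_cost)
```

An entropic barycenter functional would sum such regularized costs with weights $\lambda_i$. For small $\varepsilon$, a log-domain implementation is usually preferred to avoid underflow in the kernel.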
6. Applications and Computational Aspects
The barycentric quadratic cost underlies both theoretical and practical algorithms for measure averaging, coding, and estimation:
- Highlighted application domains include covariance estimation (with closed-form solutions for Gaussian measures), image processing, and natural language processing (Werenski et al., 2022).
- Computation: Efficient Sinkhorn-based and energy-guided stochastic gradient methods exploit the quadratic cost structure for scalable high-dimensional barycenter estimation (Kolesov et al., 2023).
- Sparsity and uniqueness: The quadratic cost enforces sparse optimizers (Monge-type plans), crucial for both interpretability and efficiency (Brizzi et al., 2024).
7. Theoretical Guarantees and Open Directions
The barycentric quadratic cost confers the following guarantees in the computation of Wasserstein barycenters:
- Unique barycenter and optimizer under absolute continuity and strict convexity.
- Energy descent and statistical consistency of estimators, with sample complexity bounds under empirical optimization (Werenski et al., 2022, Kolesov et al., 2023).
- Universal approximation by neural-network parameterized potentials for the entropic barycenter problem (Kolesov et al., 2023).
- Extension to more general cost functions, with the quadratic case serving as the canonical example by virtue of its analytical tractability and rich geometric interpretation (Tanguy et al., 2024, Brizzi et al., 2024).
The study of barycentric quadratic cost thus forms the backbone of both foundational and algorithmic developments in Wasserstein barycenter computation and its generalizations.