Entropy-Regularized Barycenter
- An entropy-regularized barycenter is a probability measure that minimizes a weighted sum of entropy-regularized Wasserstein distances; the regularization makes the solution unique and stable.
- It is characterized by a robust variational formulation and coupled PDE (Monge–Ampère) system that guarantees strong regularity, moment bounds, and statistical properties.
- Efficient computational methods like Sinkhorn-type and decentralized algorithms enable rapid convergence, making it applicable in data registration, sensor fusion, and robust classification.
An entropy-regularized barycenter is a probability measure that arises as the unique solution to an optimization problem that averages a finite collection of input measures while incorporating entropy regularization into the optimal transport metric. This framework simultaneously ensures strict convexity, regularizes the geometry, enables computational efficiency through Sinkhorn-type algorithms, and yields strong regularity and statistical properties.
1. Variational Formulation and Existence
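Given input measures $\nu_1, \dots, \nu_N$ on $\mathbb{R}^d$ (or a compact convex domain), weights $\lambda_1, \dots, \lambda_N > 0$ with $\sum_{i=1}^N \lambda_i = 1$, and entropy regularization parameter $\varepsilon > 0$, the entropy-regularized barycenter is defined as the unique minimizer of
$$\mu \mapsto \sum_{i=1}^N \lambda_i \, W_{2,\varepsilon}^2(\mu, \nu_i)$$
over all $\mu$ in $\mathcal{P}_2(\mathbb{R}^d)$, the probability measures with finite second moment. Here $W_{2,\varepsilon}^2$ denotes the squared 2-Wasserstein distance regularized by entropy, i.e.,
$$W_{2,\varepsilon}^2(\mu, \nu) = \inf_{\pi \in \Pi(\mu, \nu)} \int |x - y|^2 \, d\pi(x, y) + \varepsilon \, H(\pi),$$
where $\Pi(\mu, \nu)$ is the set of couplings of $\mu$ and $\nu$ and $H$ is the entropy functional.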
The entropy functional can be the Boltzmann–Shannon entropy, Tsallis entropy, or more general convex functionals. Under mild regularity (strict convexity of the entropy functional, lower semicontinuity), the barycenter problem possesses a unique minimizer (Carlier et al., 2020, Kum et al., 2020). In Gaussian settings, the barycenter remains Gaussian with mean and covariance given by explicit fixed-point characterizations (Mallasto et al., 2020, Kum et al., 2020).
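As a rough illustration of the regularized distance that enters the objective, the following sketch estimates it between two histograms on a one-dimensional grid with Sinkhorn scaling. Several choices here are assumptions made for the example and are not taken from the cited papers: the entropy term is taken to be $\varepsilon$ times the relative entropy of the plan with respect to the product of the marginals, the inputs are Gaussian-shaped histograms, and the names `entropic_ot` and `gaussian_hist` are hypothetical.

```python
import numpy as np

def entropic_ot(a, b, x, y, eps, n_iter=1000):
    """Sinkhorn sketch: returns (regularized cost, plan) for the squared-distance cost.

    Assumption: the entropy term is eps * KL(plan | a x b), one common convention.
    """
    C = (x[:, None] - y[None, :]) ** 2      # cost matrix |x_i - y_j|^2
    K = np.exp(-C / eps)                    # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                   # rescale to match the second marginal
        u = a / (K @ v)                     # rescale to match the first marginal
    plan = u[:, None] * K * v[None, :]
    ref = a[:, None] * b[None, :]           # product reference measure
    mask = plan > 0                         # skip entries that underflowed to zero
    kl = np.sum(plan[mask] * np.log(plan[mask] / ref[mask]))
    return np.sum(plan * C) + eps * kl, plan

def gaussian_hist(x, mean, std):
    """Normalized Gaussian-shaped histogram on the grid x (illustrative input)."""
    w = np.exp(-0.5 * ((x - mean) / std) ** 2)
    return w / w.sum()

x = np.linspace(0.0, 1.0, 50)
a = gaussian_hist(x, 0.3, 0.08)
b = gaussian_hist(x, 0.7, 0.08)

cost_ab, plan_ab = entropic_ot(a, b, x, x, eps=0.05)
cost_aa, _ = entropic_ot(a, a, x, x, eps=0.05)
print(cost_ab, cost_aa)
```

With this convention the regularized cost of a measure to itself is small but not zero, which is one source of the smoothing bias discussed for entropic barycenters.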
2. Monge–Ampère and PDE Characterization
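The entropy-regularized barycenter admits a PDE characterization via a coupled Monge–Ampère system. In the setting of Carlier et al. (2020), where the entropy term penalizes the barycenter itself, let $\nu_i$ be the inputs, $\lambda_i$ the weights, and $\varepsilon$ the regularization strength. If the barycenter $\bar\mu$ on $\Omega$ has density $\bar\rho$, and $\varphi_i$ denotes the unique (up to constants) Kantorovich potential transporting $\bar\rho$ to $\nu_i$ for the cost $\tfrac{1}{2}|x - y|^2$, then
$$\bar\rho(x) = \exp\!\Big(-\frac{1}{\varepsilon} \sum_{i=1}^N \lambda_i \varphi_i(x)\Big),$$
with the additive constants fixed so that $\int_\Omega \bar\rho \, dx = 1$. Each pair $(\bar\rho, \varphi_i)$ must satisfy the second boundary Monge–Ampère equation
$$\bar\rho(x) = \nu_i\big(x - \nabla\varphi_i(x)\big) \, \det\!\big(I - D^2\varphi_i(x)\big), \qquad x \in \Omega,$$
where $\nu_i$ also denotes the density of the $i$-th input and the map $x \mapsto x - \nabla\varphi_i(x)$ sends the support of $\bar\rho$ onto that of $\nu_i$.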
This system characterizes the barycenter as the unique solution to a regularized coupled optimal transport problem (Carlier et al., 2020).
3. Regularity and Stability Properties
Entropy regularization induces strong regularity in the barycenter:
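- Sobolev bounds: For convex domains, the barycenter density satisfies a global Sobolev bound, which in turn gives dimension-dependent integrability, with separate statements in one dimension, in two dimensions, and in dimension three and higher (Carlier et al., 2020).
- Moment bounds: If the input measures have finite moments of a given order, the barycenter has a finite moment of the same order, with explicit estimates (Carlier et al., 2020).
- Higher regularity: If the input measures are supported on a sufficiently smooth domain and have sufficiently regular densities, then the barycenter density inherits this regularity and the transport maps given by the potentials are diffeomorphisms (Carlier et al., 2020).
- Log-concavity: If each input measure is log-concave, then the barycenter satisfies Lipschitz and second-derivative bounds, with explicit spectral bounds on the Hessian of its log-density (Carlier et al., 2020).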
- Stability: The barycenter mapping is Lipschitz-continuous in the input measures in Wasserstein or entropic divergence, robust under noise or perturbations (Kum et al., 2020, Bigot et al., 2018).
4. Computational Methods and Convergence
Sinkhorn-type algorithms exploit the strong convexity from entropy regularization, allowing tractable computation:
- Dual and Fixed Point Methods: In the Gaussian case, the barycenter covariance $\Sigma$ solves a nonlinear fixed-point equation $\Sigma = \Phi(\Sigma)$, where the map $\Phi$ is given in closed form in terms of $\Sigma$ and the input covariances $\Sigma_i$ (Mallasto et al., 2020, Kum et al., 2020).
- Sinkhorn Iterations: In discrete or semi-discrete settings, alternating Bregman projections produce rapid geometric convergence for moderate $\varepsilon$ (Bigot et al., 2018).
- Decentralized Algorithms: For large distributed systems, block coordinate descent on dual variables, possibly asynchronous, achieves convergence under network delays (Zhang et al., 2023).
- Noisy Particle Gradient Descent: In grid-free settings, stochastic particle methods converge in mean-field with exponential rate to the barycenter, solving a nonlinear Fokker–Planck PDE (Chizat, 2023).
- Complexity: Each iteration has polynomial time complexity in the number of support points, avoiding the curse of dimensionality typical of unregularized barycenter algorithms (Bigot et al., 2018, Chizat, 2023).
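The alternating Bregman projections mentioned above can be sketched for histograms on a shared one-dimensional grid. This is a minimal illustration in the style of iterative Bregman projections, not the implementation of any cited paper: the grid, the Gaussian-shaped inputs, the fixed iteration count, and the names `ibp_barycenter` and `gaussian_hist` are assumptions chosen for the example, and no log-domain stabilization is included, so very small regularization values may underflow.

```python
import numpy as np

def ibp_barycenter(P, weights, x, eps, n_iter=500):
    """Entropic barycenter of the rows of P on the grid x (iterative Bregman projections)."""
    C = (x[:, None] - x[None, :]) ** 2     # squared-distance cost on the grid
    K = np.exp(-C / eps)                   # symmetric Gibbs kernel
    U = np.ones_like(P)                    # one scaling vector per input (rows)
    for _ in range(n_iter):
        V = P / (U @ K)                    # row k: v_k = p_k / (K^T u_k), fits input k
        KV = V @ K.T                       # row k: K v_k
        R = U * KV                         # row k: barycenter-side marginal of plan k
        b = np.exp(weights @ np.log(R))    # weighted geometric mean of those marginals
        U = b / KV                         # force every plan to share the marginal b
    return b / b.sum()                     # renormalize the finite-iteration residual

def gaussian_hist(x, mean, std):
    """Normalized Gaussian-shaped histogram on the grid x (illustrative input)."""
    w = np.exp(-0.5 * ((x - mean) / std) ** 2)
    return w / w.sum()

x = np.linspace(0.0, 1.0, 50)
P = np.stack([gaussian_hist(x, 0.3, 0.08), gaussian_hist(x, 0.7, 0.08)])

b_equal = ibp_barycenter(P, np.array([0.5, 0.5]), x, eps=0.02)
b_skewed = ibp_barycenter(P, np.array([0.8, 0.2]), x, eps=0.02)
print(b_equal @ x, b_skewed @ x)           # barycenter means for the two weightings
```

Each iteration costs a few dense matrix products in the number of grid points, which is the per-iteration polynomial cost referred to in the list above.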
5. Application and Statistical Properties
The entropy-regularized barycenter serves as a geometric averaging tool in a variety of applications:
- Data registration: Used for point cloud and flow cytometry alignment, controlling smoothness of barycenters via $\varepsilon$ (Bigot et al., 2018).
- Sensor fusion: Robust to misalignment and noise, enabling accurate source localization from spatial covariances (Elvander et al., 2018).
- Robust classification: Extraction of barycentric coefficients yields robust features for discrimination under heavy corruption (Mallery et al., 2025).
- Rate-distortion-perception theory: Enables efficient trade-off optimization with strong convergence guarantees and criticality analysis of constraints (Chen et al., 2023, Chen et al., 2024).
- Sample complexity: Rates are dimension-free: convergence in relative entropy is of order $n^{-1/2}$ with $n$ samples per input measure, in contrast to the dimension-dependent rates of the unregularized case (Chizat, 2023, Li et al., 2025).
- Central Limit Theorem: The empirical barycenter (from $n$ i.i.d. samples from the law $P$ over measures) satisfies a CLT with explicit covariance operator determined by the Fréchet derivative at the population barycenter (Carlier et al., 2020).
6. Debiasing and Limit Regimes
The entropic barycenter interpolates between the classical 2-Wasserstein barycenter ($\varepsilon \to 0$) and a maximum-entropy average ($\varepsilon \to \infty$):
- As $\varepsilon \to 0$, the barycenter converges to the unregularized Wasserstein barycenter, selecting among minimizers those with minimal overall entropy in the transport plans (Mallasto et al., 2020, Chen et al., 2023).
- As $\varepsilon \to \infty$, the barycenter approaches a degenerate mean (e.g., convex combination of inputs) or kernel-based interpolant ("heat-death" regime) (Mallasto et al., 2020).
- Debiased Barycenters: The so-called doubly regularized barycenter, with inner entropy parameter $\varepsilon$ and outer entropy parameter $\tau$ matched as $\tau = \varepsilon/2$, admits $O(\varepsilon^2)$ bias and mitigates the excess smoothing, with provable minimal distortion to the classical barycenter in isotropic Gaussian scenarios (Chizat, 2023).
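The effect of the regularization strength on the barycenter can be seen numerically. The sketch below recomputes a grid barycenter by iterative Bregman projections for three regularization values and reports the variance of the result. It only illustrates one particular inner-regularized scheme on a bounded grid, where larger values blur the barycenter toward a flat profile; the helper `ibp_barycenter`, the inputs, and the chosen values are assumptions for the example, and other entropic formulations can behave differently in the large-regularization limit.

```python
import numpy as np

def ibp_barycenter(P, weights, x, eps, n_iter=500):
    """Compact iterative-Bregman-projection barycenter on the grid x (illustrative)."""
    K = np.exp(-((x[:, None] - x[None, :]) ** 2) / eps)   # Gibbs kernel
    U = np.ones_like(P)
    for _ in range(n_iter):
        V = P / (U @ K)                        # fit each input marginal
        KV = V @ K.T
        b = np.exp(weights @ np.log(U * KV))   # geometric mean of current marginals
        U = b / KV                             # impose the common marginal b
    return b / b.sum()

def variance(b, x):
    """Variance of the histogram b supported on the grid x."""
    m = b @ x
    return b @ (x - m) ** 2

x = np.linspace(0.0, 1.0, 50)
hists = []
for mean in (0.3, 0.7):                        # two Gaussian-shaped inputs
    w = np.exp(-0.5 * ((x - mean) / 0.08) ** 2)
    hists.append(w / w.sum())
P = np.stack(hists)
weights = np.array([0.5, 0.5])

eps_values = [0.01, 0.1, 10.0]
variances = [variance(ibp_barycenter(P, weights, x, e), x) for e in eps_values]
for e, v in zip(eps_values, variances):
    print(f"eps={e}: variance={v:.4f}")
```

In this setup the spread grows with the regularization strength, which is the excess smoothing that debiased variants aim to remove.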
7. Special Structures and Extensions
- Multimarginal and Schrödinger Barycenters: The entropy-regularized multimarginal OT formulation admits efficient iterative scaling algorithms and dimension-independent sample complexity. Pushforward of the entropic coupling via a weighted sum yields the multimarginal Schrödinger barycenter, providing statistical optimality and tractability in high dimensions (Li et al., 2025).
- Generalized Entropies: Using Tsallis entropy yields $q$-Gaussian barycenters, with well-characterized closed-form solutions for means and covariances. Other convex entropy functionals provide additional flexibility in smoothing and regularization (Kum et al., 2020).
- Non-Euclidean Costs and Manifold Constraints: Energy-guided dual algorithms via energy-based models extend the barycenter paradigm to arbitrary cost functions, including non-Euclidean metrics and constraints to data manifolds (e.g., GAN image spaces), with rigorous duality gaps controlling plan optimality (Kolesov et al., 2023).
The entropy-regularized barycenter is thus a foundational object in modern computational optimal transport, enabling both efficient computations and strong analytical guarantees across a broad range of quantitative domains.