Wasserstein Barycenter Fusion
- Wasserstein barycenter fusion aggregates multiple probability distributions into a single one by computing their mean (the Fréchet mean) with respect to the Wasserstein metric.
- It employs a nonconvex–concave minimax formulation with the WDHA algorithm, integrating Wasserstein descent and Sobolev ascent for precise fusion.
- The approach outperforms entropic methods with faster runtimes and sharper results, making it suitable for large-scale multi-sensor and high-resolution image fusion.
Wasserstein barycenter fusion refers to the process of aggregating multiple probability distributions into a single representative distribution—the Wasserstein barycenter—by optimizing a mean in the space of probability measures endowed with the Wasserstein metric. This operation preserves geometric and spatial structure, providing a non-linear notion of “averaging” suitable for both continuous and discrete distributions in high dimensions. Recent developments emphasize nearly linear-time computation without entropic blurring, strong theoretical guarantees, and direct application to large-scale multi-modal or multi-sensor fusion tasks (Kim et al., 24 Jan 2025).
1. Mathematical Formulation and Variational Principle
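Given $n$ input probability vectors $\mu_1, \dots, \mu_n \in \mathbb{R}^m$ (histograms with $\mu_i \ge 0$ and $\sum_{j=1}^m \mu_{i,j} = 1$) on a common support $\{x_1, \dots, x_m\}$, the unregularized discrete Wasserstein-2 barycenter is defined as
$$\nu^\star \in \arg\min_{\nu \in \Delta_m} \frac{1}{n} \sum_{i=1}^n W_2^2(\mu_i, \nu),$$
where $\Delta_m$ is the probability simplex, and $W_2^2$ is the squared 2-Wasserstein distance. Kantorovich duality yields the equivalent minimax variational form (up to a constant factor of $1/2$ in the objective):
$$\min_{\nu \in \Delta_m} \; \max_{\varphi_1, \dots, \varphi_n \in \mathrm{Conv}} \; \frac{1}{n} \sum_{i=1}^n \Big[ \big\langle \tfrac{|x|^2}{2} - \varphi_i, \, \nu \big\rangle + \big\langle \tfrac{|x|^2}{2} - \varphi_i^*, \, \mu_i \big\rangle \Big],$$
where $\varphi_i$ are Kantorovich dual potentials, $\varphi_i^*$ denotes the convex conjugate of $\varphi_i$, and “Conv” is the cone of convex vectors (Kim et al., 24 Jan 2025).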
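The duality behind this minimax form can be checked numerically in one dimension. The sketch below is only an illustration under stated assumptions: it uses the dual functional $I(\varphi) = \langle |x|^2/2 - \varphi, \nu \rangle + \langle |x|^2/2 - \varphi^*, \mu \rangle$, a brute-force $O(m^2)$ conjugate rather than a fast transform, and hypothetical helper names (`conjugate`, `dual_value`, `w2_squared_1d`) that do not come from the cited paper.

```python
import numpy as np

def conjugate(phi, x):
    """Discrete convex conjugate on the grid: phi*(x_k) = max_j (x_j * x_k - phi_j)."""
    return np.max(np.outer(x, x) - phi[None, :], axis=1)

def dual_value(phi, mu, nu, x):
    """I(phi) = <|x|^2/2 - phi, nu> + <|x|^2/2 - phi*, mu>."""
    q = 0.5 * x**2
    return np.dot(q - phi, nu) + np.dot(q - conjugate(phi, x), mu)

def w2_squared_1d(mu, nu, x):
    """Exact squared W2 between two histograms on a sorted 1D grid,
    using the monotone (quantile) coupling, which is optimal in 1D."""
    cm, cn = np.cumsum(mu), np.cumsum(nu)
    levels = np.unique(np.concatenate(([0.0], cm, cn)))
    mass = np.diff(levels)
    mid = 0.5 * (levels[:-1] + levels[1:])
    last = x.size - 1
    im = np.minimum(np.searchsorted(cm, mid), last)  # quantile index for mu
    jn = np.minimum(np.searchsorted(cn, mid), last)  # quantile index for nu
    return np.sum(mass * (x[im] - x[jn]) ** 2)

x = np.linspace(0.0, 1.0, 101)
mu = np.zeros(101)
mu[10:31] = 1.0 / 21
nu = np.roll(mu, 20)            # the same bump shifted right by s = 0.2
s = x[20]

half_w2 = 0.5 * w2_squared_1d(mu, nu, x)
phi_opt = 0.5 * x**2 - s * x    # its gradient x - s maps nu back onto mu
phi_bad = np.sin(7 * x)         # an arbitrary, non-optimal potential

print(dual_value(phi_opt, mu, nu, x), dual_value(phi_bad, mu, nu, x), half_w2)
```

Any potential gives a lower bound on $\tfrac{1}{2} W_2^2$, and the potential whose gradient is the optimal map attains it, which is why the inner problem is a maximization over $\varphi_i$.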
2. Nonconvex–Concave Saddle Point Reformulation
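Defining the block-separable objective
$$\mathcal{J}(\nu, \varphi_1, \dots, \varphi_n) = \frac{1}{n} \sum_{i=1}^n I^{\mu_i}_{\nu}(\varphi_i), \qquad I^{\mu_i}_{\nu}(\varphi_i) = \big\langle \tfrac{|x|^2}{2} - \varphi_i, \, \nu \big\rangle + \big\langle \tfrac{|x|^2}{2} - \varphi_i^*, \, \mu_i \big\rangle,$$
the barycenter problem reduces to
$$\min_{\nu \in \Delta_m} \; \max_{\varphi_1, \dots, \varphi_n \in \mathrm{Conv}} \; \mathcal{J}(\nu, \varphi_1, \dots, \varphi_n).$$
This objective is nonconvex in $\nu$ with respect to the Wasserstein geometry (it is not geodesically convex in general) but, crucially, is concave in each block $\varphi_i$.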
3. The Wasserstein-Descent Homogeneous Sobolev-Ascent (WDHA) Algorithm
WDHA alternates:
- Primal descent in the Wasserstein ($W_2$) geometry (on the barycenter $\nu$)
- Dual ascent in the homogeneous Sobolev ($\dot{H}^1$) geometry (on potentials $\varphi_i$)
Pseudocode:
    input: {μ_i}_{i=1}^n on an m-point grid; init ν⁰ ∈ Δ_m, φ_i⁰ ∈ Conv
    for t = 0, …, T−1 do
        for i = 1, …, n do
            -- Sobolev ascent (dual update) --
            φ̂_i ← φ_i^t + η·∇_{Ḣ¹} I^{μ_i}_{ν^t}(φ_i^t)
            φ_i^{t+1} ← projection of φ̂_i onto Conv
        end
        -- Wasserstein descent (primal update) --
        φ̄ ← (1/n) Σ_i φ_i^{t+1}
        ν^{t+1} ← (id − τ·(id − ∇φ̄))_# ν^t
    end
    output: ν^T, {φ_i^T}
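The primal update in the pseudocode, a pushforward of the current barycenter under the map $x \mapsto x - \tau (x - \nabla\bar\varphi(x))$, can be sketched on a 1D grid as follows. This is a hedged illustration, not the paper's discretization: the function name `pushforward_step`, the finite-difference gradient, and the linear splitting of mass between neighbouring grid nodes are all assumptions chosen for simplicity.

```python
import numpy as np

def pushforward_step(nu, phi_bar, x, tau):
    """One Wasserstein descent step on a uniform 1D grid (illustrative only).

    Each node x_j sends its mass nu_j to T(x_j) = x_j - tau * (x_j - phi_bar'(x_j)).
    The mass is split linearly between the two grid nodes surrounding T(x_j),
    which preserves total mass and the first moment.
    """
    h = x[1] - x[0]
    grad = np.gradient(phi_bar, h, edge_order=2)       # finite-difference gradient
    target = np.clip(x - tau * (x - grad), x[0], x[-1])
    pos = (target - x[0]) / h                          # fractional grid index
    lo = np.minimum(np.floor(pos).astype(int), x.size - 2)
    w = pos - lo                                       # weight for the right neighbour
    out = np.zeros_like(nu)
    np.add.at(out, lo, nu * (1.0 - w))
    np.add.at(out, lo + 1, nu * w)
    return out

x = np.linspace(0.0, 1.0, 101)
nu = np.zeros(101)
nu[40:61] = 1.0 / 21
phi_bar = 0.5 * x**2 - 0.1 * x      # gradient is x - 0.1, so id - grad = 0.1
tau = 0.5
nu_next = pushforward_step(nu, phi_bar, x, tau)

print(np.dot(x, nu), np.dot(x, nu_next))  # the mean moves left by tau * 0.1
```

With $\tau = 1$ the map reduces to $\nabla\bar\varphi$ itself; smaller $\tau$ moves only part of the way, which is the role of the primal step size.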
The key gradients are:
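- Dual ($\dot{H}^1$ gradient with respect to $\varphi_i$): $\nabla_{\dot{H}^1} I^{\mu_i}_{\nu}(\varphi_i) = (-\Delta)^{-1}\big( (\nabla \varphi_i^*)_\# \mu_i - \nu \big)$
- Primal (Wasserstein gradient with respect to $\nu$): $\mathrm{id} - \nabla \bar\varphi$, where $\bar\varphi = \frac{1}{n} \sum_{i=1}^n \varphi_i$ (Kim et al., 24 Jan 2025).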
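The dual ascent direction lives in the homogeneous Sobolev geometry, which in practice means solving a Poisson equation $-\Delta g = r$ for a zero-mean right-hand side $r$. The sketch below shows one way to do this with an FFT on a 1D grid with zero-flux boundaries. It is a minimal example under assumptions: the name `h1_gradient`, the even-extension trick, and the stand-in densities are illustrative and not taken from the cited paper.

```python
import numpy as np

def h1_gradient(rhs, h):
    """Solve -g'' = rhs on a uniform 1D grid with zero-flux (Neumann) boundaries.

    rhs must sum to zero. The even extension of rhs is periodic with period 2m,
    so the Neumann problem becomes a periodic one that the FFT diagonalizes.
    """
    m = rhs.size
    ext = np.concatenate([rhs, rhs[::-1]])              # even extension, length 2m
    k = np.arange(2 * m)
    lam = (2.0 - 2.0 * np.cos(np.pi * k / m)) / h**2    # eigenvalues of -Laplacian
    ext_hat = np.fft.fft(ext)
    g_hat = np.zeros_like(ext_hat)
    g_hat[1:] = ext_hat[1:] / lam[1:]                   # skip the zero mode
    return np.fft.ifft(g_hat).real[:m]

m = 64
h = 1.0 / m
x = (np.arange(m) + 0.5) * h
pushed = np.exp(-((x - 0.3) ** 2) / 0.01)
pushed /= pushed.sum()                                  # stand-in for the pushed-forward input
nu = np.exp(-((x - 0.6) ** 2) / 0.02)
nu /= nu.sum()                                          # stand-in for the current barycenter
rhs = pushed - nu                                       # zero-sum right-hand side
g = h1_gradient(rhs, h)
```

The cost is dominated by the FFT, so the solve is $O(m \log m)$ in the number of grid points.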
4. Convergence and Complexity
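Under suitable density boundedness and step size constraints (specific thresholds are given in (Kim et al., 24 Jan 2025)), the squared Wasserstein gradient norm decays at an explicit rate in the number of iterations $T$, which in turn bounds the number of iterations needed to find an $\varepsilon$-stationary point; the precise bounds are stated in the same reference.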
Computational complexity per WDHA iteration:
| Operation | Time Complexity | Space Complexity |
|---|---|---|
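| Dual block update | $O(m \log m)$ | $O(m)$ |
| Primal update | $O(m \log m)$ | $O(m)$ |
| LP for OT map | $O(m^3)$ | $O(m^2)$ |
| Sinkhorn-type | $O(m^2)$ per iteration, with an iteration count that grows as the accuracy $\varepsilon$ tightens | -- |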
Efficient WDHA implementation leverages fast Legendre transforms and FFT-based Poisson solvers.
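To give a feel for what these asymptotics mean at image scale, the short calculation below compares raw operation counts on a $1024 \times 1024$ grid. It is a back-of-the-envelope sketch that assumes a nearly linear $O(m \log m)$ cost per WDHA iteration, an $O(m^2)$ cost for one dense kernel application, and the commonly quoted $O(m^3)$ cost for linear programming; constants and iteration counts are ignored.

```python
import math

side = 1024
m = side * side                      # number of grid points

nearly_linear_ops = m * math.log2(m) # O(m log m), e.g. FFT-based updates
dense_ops = m ** 2                   # O(m^2), one dense m-by-m kernel application
lp_ops = m ** 3                      # O(m^3), generic linear programming

print(f"m = {m}")
print(f"m log2 m = {nearly_linear_ops:.3e}")
print(f"m^2      = {dense_ops:.3e}  ({dense_ops / nearly_linear_ops:.0f}x larger)")
print(f"m^3      = {lp_ops:.3e}")
```

Even before constants are considered, the gap between $m \log m$ and $m^2$ at this resolution is more than four orders of magnitude.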
5. Multi-Modal and Multi-Sensor Fusion Applications
Each data modality or sensor provides a discrete distribution $\mu_i$ on a common spatial grid. WDHA fuses these into a barycenter $\nu$ that plays the role of the mean (the Fréchet mean) in Wasserstein geometry, without the entropic blur introduced by regularized Sinkhorn solvers.
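The difference between a Wasserstein mean and a plain linear average is easiest to see in one dimension, where the barycenter of equal-size samples has a closed form: average the sorted samples (that is, the quantiles). The sketch below uses that 1D special case purely as an illustration; it is not WDHA, and the two "sensors" are hypothetical Gaussian sample sets.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
a = rng.normal(-2.0, 0.5, n)     # readings from a hypothetical sensor A
b = rng.normal(2.0, 1.0, n)      # readings from a hypothetical sensor B

# 1D Wasserstein-2 barycenter of two equal-size empirical measures:
# average the order statistics (quantile averaging).
bary = 0.5 * (np.sort(a) + np.sort(b))

# Linear averaging of the two distributions corresponds to pooling the samples.
mix = np.concatenate([a, b])

print("barycenter mean/std:", bary.mean(), bary.std())
print("mixture    mean/std:", mix.mean(), mix.std())
```

The barycenter stays unimodal, centred between the inputs with a spread close to the average of their spreads, whereas the pooled mixture keeps two separate modes and a much larger spread.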
Practical points:
- The algorithm is scalable (per-iteration cost $O(m \log m)$) and feasible on GPUs for grids up to $1024 \times 1024$.
- Separability of the grid allows efficient 1D row/column transforms.
- The dual potentials can optionally be projected onto convex functions by convex double conjugation (Kim et al., 24 Jan 2025).
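The double-conjugation projection mentioned in the last point can be sketched in 1D: applying the discrete Legendre transform twice returns the convex envelope of a function on the grid. The code below is a brute-force $O(m^2)$ illustration with hypothetical names (`conjugate`, `convex_envelope`) and a hand-picked slope grid; a fast Legendre transform would replace the dense maximization in a scalable implementation.

```python
import numpy as np

def conjugate(f, x, s):
    """Discrete Legendre transform: f*(s_k) = max_j (s_k * x_j - f(x_j))."""
    return np.max(np.outer(s, x) - f[None, :], axis=1)

def convex_envelope(f, x, slopes):
    """Double conjugate f** evaluated back on the grid x.

    f** never exceeds f, and it is a maximum of affine functions, hence convex.
    """
    f_star = conjugate(f, x, slopes)
    return conjugate(f_star, slopes, x)

x = np.linspace(-1.0, 1.0, 201)
slopes = np.linspace(-3.0, 3.0, 601)   # covers the slope range of both examples

convex_f = x**2
double_well = (x**2 - 0.25) ** 2       # nonconvex: two wells at x = -0.5 and 0.5

env_convex = convex_envelope(convex_f, x, slopes)
env_well = convex_envelope(double_well, x, slopes)
```

A convex input is left essentially unchanged, while the bump between the two wells is flattened to the common minimum.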
6. Empirical Results: Accuracy and Runtime Advantages
WDHA yields sharper and more accurate unregularized barycenters than entropic-regularized (Sinkhorn-type) approaches, with significantly lower runtime:
| Experiment | WDHA (iterations / time / cost) | CWB (entropic, time / cost) | DSB (entropic, time / cost) |
|---|---|---|---|
| 4 shapes | 300 / 676 s / 74.58e-3 | 3731 s / 75.07e-3 (blurred) | 7249 s / 74.58e-3 (blurred) |
| Handwritten "8" | ~3300 s (sharp) | ~10800 s (blurred) | ~11200 s (blurred) |
WDHA, marrying Wasserstein primal descent and Sobolev dual ascent, outperforms entropic-regularized algorithms in both sharpness of the barycenter (no blurring) and wall-clock time, particularly at high resolution (Kim et al., 24 Jan 2025).
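The wall-clock advantage can be read directly off the table as speedup ratios. The snippet below only restates the reported runtimes; the handwritten-digit timings are approximate in the source, so the corresponding ratios are approximate as well.

```python
# Reported runtimes in seconds (WDHA, CWB, DSB) for each experiment.
runtimes = {
    "4 shapes": {"WDHA": 676, "CWB": 3731, "DSB": 7249},
    'Handwritten "8"': {"WDHA": 3300, "CWB": 10800, "DSB": 11200},  # approximate
}

speedups = {
    name: {method: t[method] / t["WDHA"] for method in ("CWB", "DSB")}
    for name, t in runtimes.items()
}

for name, ratios in speedups.items():
    for method, ratio in ratios.items():
        print(f"{name}: WDHA is {ratio:.1f}x faster than {method}")
```

On these figures WDHA is roughly 5.5x and 10.7x faster on the shapes experiment, and a little over 3x faster on the handwritten digit.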
7. Theoretical Guarantees and Practical Recommendations
- The WDHA minimax optimization (nonconvex in $\nu$, concave in the dual blocks $\varphi_i$) is well-posed under boundedness and geometric convexity assumptions.
- Convergence rate: the gradient norm decays at the rate established in (Kim et al., 24 Jan 2025), which bounds the number of iterations needed to reach approximate stationarity.
- Memory and runtime per iteration scale as $O(m)$ and $O(m \log m)$ respectively, so memory stays linear in the grid size $m$ even at high resolution.
- WDHA is recommended for scenarios requiring sharp barycentric fusion at scale (e.g., sensor networks, high-res image morphing), where entropic regularization would induce excessive bias or blur.
References
- J. Kim, L. Nurbekyan, G. Peyré, "Optimal Transport Barycenter via Nonconvex-Concave Minimax Optimization" (24 Jan 2025).