
Wasserstein Barycenter Fusion

Updated 21 April 2026
  • Wasserstein barycenter fusion aggregates multiple probability distributions by computing their Fréchet mean under the Wasserstein metric, a geometry-aware notion of averaging.
  • The WDHA algorithm solves a nonconvex–concave minimax formulation of the problem by alternating Wasserstein descent on the barycenter with Sobolev ascent on the dual potentials.
  • In the reported experiments the approach runs faster and gives sharper results than entropic methods, which makes it suitable for large-scale multi-sensor and high-resolution image fusion.

Wasserstein barycenter fusion refers to the process of aggregating multiple probability distributions into a single representative distribution—the Wasserstein barycenter—by optimizing a mean in the space of probability measures endowed with the Wasserstein metric. This operation preserves geometric and spatial structure, providing a non-linear notion of “averaging” suitable for both continuous and discrete distributions in high dimensions. Recent developments emphasize nearly linear-time computation without entropic blurring, strong theoretical guarantees, and direct application to large-scale multi-modal or multi-sensor fusion tasks (Kim et al., 24 Jan 2025).

1. Mathematical Formulation and Variational Principle

Given input probability vectors $\mu_i\in\mathbb{R}^m$ (histograms with $\mu_i^j\ge0$ and $\sum_j\mu_i^j=1$) on a common support $x_1,\dots,x_m$, the unregularized discrete Wasserstein-2 barycenter is defined as

$$\bar\nu = \arg\min_{\nu \in \Delta_m} \frac{1}{n} \sum_{i=1}^n W_2^2(\nu, \mu_i)$$

where $\Delta_m$ is the probability simplex, and $W_2^2(\nu, \mu_i)$ is the squared 2-Wasserstein distance. Kantorovich duality yields the equivalent minimax variational form

$$\min_{\nu\in\Delta_m}\, \max_{\varphi_1,\dots,\varphi_n \in \text{Conv}} \frac{1}{n} \sum_{i=1}^n \left\{ \sum_{j=1}^m \left( \frac{1}{2}\|x_j\|^2 - \varphi_i^j \right) \nu^j + \sum_{k=1}^m \left( \frac{1}{2}\|x_k\|^2 - (\varphi_i^*)^k \right)\mu_i^k \right\}$$

where $\varphi_i$ are Kantorovich dual potentials, $\varphi_i^*$ denotes the convex conjugate, and “Conv” is the cone of convex vectors (Kim et al., 24 Jan 2025).
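As a small illustration of the barycenter objective, the sketch below works in one dimension with equal-size, equally weighted samples rather than grid histograms (a simplifying assumption, not the setting of the source). In that special case the squared 2-Wasserstein distance reduces to matching sorted samples, and the minimizer of the objective is the average of the sorted samples. The function names are hypothetical.

```python
import numpy as np

def w2_squared_1d(x, y):
    """Squared W2 distance between two equal-size, equal-weight 1D samples."""
    x, y = np.sort(x), np.sort(y)
    return float(np.mean((x - y) ** 2))

def barycenter_objective(nu, mus):
    """(1/n) * sum_i W_2^2(nu, mu_i), the quantity minimized by the barycenter."""
    return float(np.mean([w2_squared_1d(nu, mu) for mu in mus]))

def quantile_average(mus):
    """1D barycenter of equal-size samples: average the sorted samples."""
    return np.mean([np.sort(mu) for mu in mus], axis=0)

rng = np.random.default_rng(0)
params = [(-2.0, 0.5), (0.0, 1.0), (3.0, 2.0)]  # (mean, std) of each input
mus = [rng.normal(loc=c, scale=s, size=500) for c, s in params]

nu_bar = quantile_average(mus)
obj_bar = barycenter_objective(nu_bar, mus)
# Using any single input as the "fused" distribution gives a larger objective.
obj_inputs = [barycenter_objective(mu, mus) for mu in mus]
print(obj_bar, obj_inputs)
```

In higher dimensions no such closed form exists, which is what motivates the minimax formulation.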

2. Nonconvex–Concave Saddle Point Reformulation

Defining the block-separable objective

$$J(\nu, \Phi) = \frac{1}{n} \sum_{i=1}^n I^{\mu_i}_\nu(\varphi_i), \quad I^\mu_\nu(\varphi) = \sum_{j=1}^m \left( \tfrac{1}{2} \|x_j\|^2 - \varphi^j \right) \nu^j + \sum_{k=1}^m \left( \tfrac{1}{2} \|x_k\|^2 - (\varphi^*)^k \right) \mu^k$$

with $\Phi = (\varphi_1,\dots,\varphi_n)$, the barycenter problem reduces to

$$\min_{\nu\in\Delta_m}\ \max_{\Phi\in\text{Conv}^n} J(\nu, \Phi)$$

This objective is nonconvex in $\nu$ (it is not geodesically convex in general) but, crucially, is concave in each block $\varphi_i$.
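The dual functional $I^\mu_\nu(\varphi)$ can be evaluated directly on a grid. The sketch below is a minimal 1D version that uses a brute-force $O(m^2)$ discrete Legendre transform (not the fast transform used in the source) on an integer grid chosen so that the arithmetic is exact. It checks two facts on a toy pair where $\mu$ is $\nu$ shifted by 3: the optimal potential $\varphi(x) = x^2/2 + 3x$ attains half the squared transport cost, $3^2/2 = 4.5$, and the functional is concave along a segment of potentials. All names are hypothetical.

```python
import numpy as np

def legendre(phi, x, y):
    """Discrete convex conjugate: phi*(y_k) = max_j (x_j * y_k - phi_j)."""
    return np.max(np.outer(y, x) - phi[None, :], axis=1)

def dual_functional(phi, nu, mu, x):
    """I^mu_nu(phi) for histograms nu, mu supported on the common grid x."""
    phi_star = legendre(phi, x, x)
    return float(np.sum((0.5 * x**2 - phi) * nu)
                 + np.sum((0.5 * x**2 - phi_star) * mu))

x = np.arange(-20.0, 21.0)           # integer grid, 41 points
nu = np.zeros_like(x)
nu[10:21] = 1.0                      # uniform on x = -10, ..., 0
nu /= nu.sum()
shift = 3
mu = np.roll(nu, shift)              # nu translated by 3 grid cells

phi_id = 0.5 * x**2                  # potential of the identity map
phi_opt = 0.5 * x**2 + shift * x     # potential of the translation map
phi_mid = 0.5 * (phi_id + phi_opt)   # midpoint of the segment

I_id = dual_functional(phi_id, nu, mu, x)
I_opt = dual_functional(phi_opt, nu, mu, x)
I_mid = dual_functional(phi_mid, nu, mu, x)
print(I_id, I_mid, I_opt)
```

Concavity in $\varphi$ follows because $\varphi \mapsto \varphi^*$ is convex and enters the functional with a minus sign; the midpoint value lying above the average of the endpoint values is one instance of that.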

3. The Wasserstein-Descent Homogeneous Sobolev-Ascent (WDHA) Algorithm

WDHA alternates:

  • Primal descent in the Wasserstein ($\mathbb{W}_2$) geometry (on the barycenter $\nu$)
  • Dual ascent in the homogeneous Sobolev ($\dot{H}^1$) geometry (on potentials $\varphi_i$)

Pseudocode:

input: {μ_i}_{i=1}^n on an m-point grid; step sizes η, τ; init ν⁰ ∈ Δ_m, φ_i⁰ ∈ Conv
for t = 0 … T−1 do
    for i = 1 … n do
        -- Sobolev ascent (dual update) --
        φ̂_i ← φ_i^t + η·∇_{Ḣ¹} I^{μ_i}_{ν^t}(φ_i^t)
        φ_i^{t+1} ← projection of φ̂_i onto Conv (e.g. by double convex conjugation)
    end
    -- Wasserstein descent (primal update) --
    φ̄ ← (1/n) Σ_i φ_i^{t+1}
    ν^{t+1} ← (id − τ·(id − ∇φ̄))_# ν^t
end
output: ν^T, {φ_i^T}

The key gradients are:

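  • Dual ($\dot{H}^1$ gradient in $\varphi$): $\nabla_{\dot{H}^1} I^\mu_\nu(\varphi) = (-\Delta)^{-1}\big(-\nu + (\nabla\varphi^*)_\#\mu\big)$
  • Primal (Wasserstein gradient in $\nu$): $\nabla_{\mathbb{W}_2} J(\nu, \Phi) = \mathrm{id} - \nabla\bar\varphi$, where $\bar\varphi = \frac{1}{n}\sum_{i=1}^n \varphi_i$ (Kim et al., 24 Jan 2025).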
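To make the primal update concrete, the following sketch applies the step $x \leftarrow x - \tau\,(x - \nabla\bar\varphi(x))$ to particles in one dimension. It is not WDHA itself: the Sobolev-ascent dual iterates are replaced by the exact 1D optimal transport maps $T_i = \nabla\varphi_i$ (which match samples by rank), and the barycenter is represented by particles rather than a grid histogram. Both are simplifying assumptions made only for illustration, and the names are hypothetical.

```python
import numpy as np

def primal_step(x, mus_sorted, tau):
    """One step x <- x - tau * (x - mean_i T_i(x)) with exact 1D monotone maps T_i."""
    order = np.argsort(x)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(x.size)        # rank of each particle
    # In 1D, T_i sends the particle of rank r to the r-th smallest sample of mu_i.
    mean_map = np.mean([m[ranks] for m in mus_sorted], axis=0)
    return x - tau * (x - mean_map)

rng = np.random.default_rng(1)
params = [(-1.0, 0.5), (2.0, 1.5)]          # (mean, std) of each input
mus_sorted = [np.sort(rng.normal(c, s, 400)) for c, s in params]

x = rng.uniform(-5.0, 5.0, 400)             # initial particles for nu
for _ in range(200):
    x = primal_step(x, mus_sorted, tau=0.2)

# The fixed point is the 1D barycenter: the average of the sorted samples.
target = np.mean(mus_sorted, axis=0)
print(np.max(np.abs(np.sort(x) - target)))
```

Each step preserves the ordering of the particles, so the distance to the fixed point shrinks by a factor $1-\tau$ per iteration in this idealized setting. With approximate potentials, as in the alternating scheme, no such simple contraction should be expected.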

4. Convergence and Complexity

Under suitable density boundedness and step size constraints (specific thresholds are given in Kim et al., 24 Jan 2025), the squared Wasserstein gradient norm decays at the rate established there, which guarantees that an $\epsilon$-stationary point is found within the corresponding iteration bound.

Computational complexity per WDHA iteration:

Operation | Time Complexity | Space Complexity
Dual block update | … | …
Primal update | … | …
LP for OT map | … | …
Sinkhorn-type | … (for accuracy $\epsilon$) | --

Efficient WDHA implementation leverages fast Legendre transforms and FFT-based Poisson solvers.
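An $\dot{H}^1$ gradient is obtained from the ordinary first variation by applying the inverse of the negative Laplacian, which on a regular grid can be done in $O(m \log m)$ time with FFTs. The sketch below shows this in 1D with periodic boundary conditions. The boundary conditions are an assumption made for brevity; the source does not state which ones it uses, and Neumann conditions with a cosine transform are a common alternative. The function name is hypothetical.

```python
import numpy as np

def inverse_neg_laplacian_periodic(f, length=1.0):
    """Solve -u'' = f on a periodic 1D grid; returns the zero-mean solution."""
    m = f.size
    k = 2.0 * np.pi * np.fft.fftfreq(m, d=length / m)  # angular wavenumbers
    f_hat = np.fft.fft(f - f.mean())                   # solvable only for zero-mean f
    u_hat = np.zeros_like(f_hat)
    nonzero = k != 0
    u_hat[nonzero] = f_hat[nonzero] / k[nonzero] ** 2  # symbol of -d^2/dx^2 is k^2
    return np.fft.ifft(u_hat).real

m = 256
x = np.arange(m) / m
u_true = np.cos(2 * np.pi * x) + 0.5 * np.sin(6 * np.pi * x)
f = ((2 * np.pi) ** 2 * np.cos(2 * np.pi * x)
     + 0.5 * (6 * np.pi) ** 2 * np.sin(6 * np.pi * x))

u = inverse_neg_laplacian_periodic(f)
print(np.max(np.abs(u - u_true)))
```

On a 2D image grid the same idea applies with `np.fft.fft2` and the symbol $k_1^2 + k_2^2$.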

5. Multi-Modal and Multi-Sensor Fusion Applications

Each data modality or sensor provides a discrete distribution $\mu_i$ on a common spatial grid. WDHA fuses these into a barycenter $\bar\nu$ that captures their central tendency (the Fréchet mean) in Wasserstein geometry, without the entropic blur introduced by regularized Sinkhorn solvers.

Practical points:

  • Algorithmically scalable (per-iteration $O(m\log m)$ time), feasible on GPUs for high-resolution grids.
  • Separability of the grid allows the transforms to be applied as efficient 1D row/column passes.
  • Dual potentials can optionally be projected onto convex functions by double convex conjugation (Kim et al., 24 Jan 2025).
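Projection by double convex conjugation can be sketched in a few lines: conjugating twice replaces a potential by its convex envelope, restricted here to slopes on a chosen grid. The version below is a brute-force $O(m^2)$ 1D illustration, not the fast Legendre transform the source refers to; the slope grid and the double-well test function are assumptions chosen for the example.

```python
import numpy as np

def legendre(values, grid_in, grid_out):
    """Discrete conjugate: f*(s_k) = max_j (x_j * s_k - f_j)."""
    return np.max(np.outer(grid_out, grid_in) - values[None, :], axis=1)

def double_conjugate(phi, x, slopes):
    """Convex envelope of phi on grid x, using affine minorants with the given slopes."""
    phi_star = legendre(phi, x, slopes)
    return legendre(phi_star, slopes, x)

x = np.linspace(-2.0, 2.0, 201)
slopes = np.linspace(-25.0, 25.0, 1001)   # covers the range of phi' on [-2, 2]
phi = (x**2 - 1.0) ** 2                   # double well: nonconvex between the minima

phi_cc = double_conjugate(phi, x, slopes)

# The envelope lies below phi, is convex, and flattens the bump between the wells.
inside = np.abs(x) <= 1.0
print(phi_cc[inside].max(), np.diff(phi_cc, 2).min())
```

Separable 1D passes of this kind are what make row/column application on an image grid possible, although an efficient implementation would use a linear-time transform rather than the dense `np.outer` product.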

6. Empirical Results: Accuracy and Runtime Advantages

WDHA yields sharper and more accurate unregularized barycenters than entropic-regularized (Sinkhorn-type) approaches, with significantly lower runtime:

Experiment | WDHA (iterations / time / cost) | CWB (entropic; time / cost) | DSB (entropic; time / cost)
4 shapes | 300 / 676 s / 74.58e-3 | 3731 s / 75.07e-3 (blurred) | 7249 s / 74.58e-3 (blurred)
Handwritten "8" | ~3300 s (sharp) | ~10800 s (blurred) | ~11200 s (blurred)

WDHA, marrying Wasserstein primal descent and Sobolev dual ascent, outperforms entropic-regularized algorithms in both sharpness of the barycenter (no blurring) and wall-clock time, particularly at high resolution (Kim et al., 24 Jan 2025).

7. Theoretical Guarantees and Practical Recommendations

  • The WDHA minimax problem (nonconvex in $\nu$, concave in the dual blocks) is well-posed under boundedness and geometric convexity assumptions.
  • Convergence rate: the squared Wasserstein gradient norm decays at the rate established in (Kim et al., 24 Jan 2025), which guarantees convergence to stationarity.
  • Memory and runtime per iteration scale as $O(m)$ and $O(m\log m)$ respectively, so memory stays linear in the grid size even for high-dimensional barycenter fusion.
  • WDHA is recommended for scenarios requiring sharp barycentric fusion at scale (e.g., sensor networks, high-res image morphing), where entropic regularization would induce excessive bias or blur.

References
