Double-Sliced Wasserstein Metric

Updated 11 November 2025
  • Double-Sliced Wasserstein is a metric that compares probability meta-measures using two sequential slicing operations, preserving the topology of the original Wasserstein-over-Wasserstein distance.
  • It combines Euclidean projections and quantile-space slicing to achieve computational efficiency and numerical robustness in high-dimensional data analysis.
  • Empirical evaluations demonstrate that DSW provides comparable discriminative power to WoW while accelerating computation and reducing sensitivity to unstable high-order moment estimation.

The Double-Sliced Wasserstein (DSW) metric is a recent development in optimal transport on spaces of probability measures, designed as a computationally efficient and statistically robust surrogate for the Wasserstein-over-Wasserstein (WoW) distance between meta-measures. DSW achieves speed and stability by combining traditional Euclidean slicing with an inner slicing in quantile function space, avoiding reliance on high-order moments or unstable operations. It is topologically equivalent to WoW on empirical meta-measures and empirically offers substantial speedups with comparable discriminative power in applications such as dataset similarity, point-cloud analysis, and perceptual evaluation of images and shapes (Piening et al., 26 Sep 2025).

1. Meta-Measure Spaces and the Wasserstein-Over-Wasserstein Problem

Let $\mathcal{X}$ be a Polish space and $P_2(\mathcal{X})$ the set of Borel probability measures with finite second moment, equipped with the 2-Wasserstein distance,

$W_2(\mu,\nu) = \left(\inf_{\pi\in\Gamma(\mu,\nu)} \int_{\mathcal{X}^2} d^2(x,x')\,d\pi(x,x')\right)^{1/2}.$

A meta-measure is defined as $\alpha\in P_2\bigl(P_2(\mathcal{X})\bigr)$, that is, a probability law over probability measures on $\mathcal{X}$. The Wasserstein-over-Wasserstein (WoW) metric lifts the $W_2$ distance to the meta-measure space:

$\mathrm{WoW}(\alpha, \beta) = \left[ \inf_{\Pi \in \Gamma(\alpha,\beta)} \int_{P_2(\mathcal{X}) \times P_2(\mathcal{X})} W_2^2(\mu,\nu)\, d\Pi(\mu,\nu) \right]^{1/2},$

which is computationally prohibitive for large collections of distributions, especially in high dimensions, due to quadratic scaling in the number of inner measures.
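
As a reference point, the exact WoW distance between two small empirical meta-measures can be computed by nesting two discrete OT solves. The following is a minimal sketch, assuming the Python Optimal Transport package (POT, imported as `ot`) is installed; the sizes and toy data are illustrative, not from the paper.

```python
# Exact Wasserstein-over-Wasserstein between two small empirical meta-measures.
# Assumes the POT package ("pip install pot") is available; sizes are kept tiny
# because the cost is dominated by all pairwise inner W_2 computations.
import numpy as np
import ot  # Python Optimal Transport

rng = np.random.default_rng(0)
d, n, N, M = 3, 50, 8, 10  # ambient dimension, inner points, outer atoms

# Two meta-measures: lists of inner point clouds (uniform empirical measures).
alpha = [rng.normal(size=(n, d)) for _ in range(N)]
beta = [rng.normal(loc=0.5, size=(n, d)) for _ in range(M)]

def w2_squared(X, Y):
    """Squared 2-Wasserstein distance between uniform empirical measures."""
    a = np.full(len(X), 1.0 / len(X))
    b = np.full(len(Y), 1.0 / len(Y))
    C = ot.dist(X, Y)  # squared Euclidean cost by default
    return ot.emd2(a, b, C)

# Outer cost matrix: all pairwise inner W_2^2 distances (the expensive step).
C_outer = np.array([[w2_squared(mu, nu) for nu in beta] for mu in alpha])

a = np.full(N, 1.0 / N)
b = np.full(M, 1.0 / M)
wow = np.sqrt(ot.emd2(a, b, C_outer))
print(f"WoW(alpha, beta) ~ {wow:.4f}")
```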

2. Quantile Isometry and Functional Slicing

For measures on $\mathbb{R}$, the 1D 2-Wasserstein metric admits an isometry to $L^2([0,1])$, mapping a measure $\mu$ to its quantile function $Q_\mu(s)= \inf\{x:\mu(-\infty,x]\geq s\}$:

$W_2(\mu,\nu;\mathbb{R}) = \left[ \int_0^1 |Q_\mu(s)-Q_\nu(s)|^2\, ds \right]^{1/2} = \|Q_\mu-Q_\nu\|_{L^2([0,1])}.$

This isometry underpins the functional optimal transport approach used in DSW. Sliced-Wasserstein distances on general Banach spaces $U$ make use of projections $\pi_v(x)=\langle v, x\rangle$ for $v\in U^*$, and for a probability measure $\xi$ on $U^*$,

$SW(\mu,\nu; \xi) = \left[ \int_{v \in U^*} W_2^2(\pi_{v\#}\mu, \pi_{v\#}\nu; \mathbb{R})\, d\xi(v) \right]^{1/2}.$

This construction, under appropriate support conditions on $\xi$, yields a true metric on $P_2(U)$.
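
For the familiar case $U = \mathbb{R}^d$ with $\xi$ the uniform measure on the unit sphere, this construction reduces to the classical sliced-Wasserstein distance. A minimal Monte Carlo sketch (sample sizes and projection counts are illustrative choices, not the paper's code):

```python
# Monte Carlo sliced-Wasserstein in the case U = R^d, with xi taken as the
# uniform measure on the unit sphere S^{d-1}. A sketch, not a reference code.
import numpy as np

def sliced_w2(X, Y, n_proj=200, rng=None):
    """SW_2 between two equal-size uniform empirical measures on R^d."""
    rng = rng or np.random.default_rng()
    d = X.shape[1]
    thetas = rng.normal(size=(n_proj, d))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)  # points on S^{d-1}
    sw2 = 0.0
    for theta in thetas:
        # 1D projections; for equal sample sizes, W_2^2 is the mean squared
        # difference of sorted projections (the quantile formula above).
        px, py = np.sort(X @ theta), np.sort(Y @ theta)
        sw2 += np.mean((px - py) ** 2)
    return np.sqrt(sw2 / n_proj)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
Y = rng.normal(loc=1.0, size=(500, 5))
print(sliced_w2(X, Y, rng=rng))
```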

In the specific setting of meta-measures on $P_2(\mathbb{R})$, the quantile map $q:\mu\mapsto Q_\mu$ pushes $\alpha$ to a law $q_\#\alpha$ on $L^2([0,1])$, yielding a “sliced-quantile WoW” (SQW) metric,

$SQW(\alpha, \beta; \xi) = SW(q_\#\alpha, q_\#\beta; \xi).$
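
The quantile isometry above is easy to check numerically for empirical measures. The sketch below discretizes $L^2([0,1])$ on a midpoint quantile grid; the grid size and test distributions are illustrative assumptions.

```python
# Numerical check of W_2(mu, nu; R) = ||Q_mu - Q_nu||_{L^2([0,1])} for
# uniform empirical measures on R. Grid size m is an illustrative choice.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=400)        # samples of mu
y = rng.exponential(size=600)   # samples of nu (unequal sizes are fine)

m = 2000                        # discretization of [0, 1]
u = (np.arange(m) + 0.5) / m    # midpoint quantile levels
Qx, Qy = np.quantile(x, u), np.quantile(y, u)

w2_quantile = np.sqrt(np.mean((Qx - Qy) ** 2))  # L^2 norm of Q_mu - Q_nu
print(f"W_2 via quantile functions ~ {w2_quantile:.4f}")

# The same quantile grid turns each 1D measure into a vector in R^m, which is
# how the inner (functional) slicing of SQW is discretized in practice.
```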

3. Construction and Mathematical Formulation of Double-Sliced Wasserstein

The Double-Sliced Wasserstein metric is constructed through consecutive application of two slicing steps:

  1. Euclidean Slicing: For each $\theta \in S^{d-1}$, project every inner measure $\mu \in P_2(\mathbb{R}^d)$ onto $\mathbb{R}$ via $\pi_\theta(x) = \langle \theta, x \rangle$, inducing a pushed-forward measure $\pi_{\theta\#}\mu$.
  2. Quantile-Space Slicing: For fixed $\theta$, one obtains two 1D meta-measures $P_{\theta\#}\alpha,\,P_{\theta\#}\beta \in P_2(P_2(\mathbb{R}))$. Using a Gaussian process prior $\xi$ on $L^2([0,1])$ (e.g., with an RBF kernel), the SQW distance between these meta-measures is

$SW(P_{\theta\#}\alpha,\,P_{\theta\#}\beta; \xi) = \left[ \int_{v \in L^2([0,1])} W_2^2\left(\pi_{v\#}q_\#(P_{\theta\#}\alpha),\, \pi_{v\#}q_\#(P_{\theta\#}\beta) \right) d\xi(v) \right]^{1/2}.$

  3. Aggregation: Integrate the inner SQW metric over $S^{d-1}$ to obtain the Double-Sliced Wasserstein: $DSW(\alpha, \beta; \xi) = \left[ \int_{S^{d-1}} SW^2(P_{\theta\#}\alpha,\,P_{\theta\#}\beta; \xi)\, dS^{d-1}(\theta) \right]^{1/2}.$

The full expansion reads: $DSW(\alpha, \beta) = \left\{ \int_{S^{d-1}} \int_{v \in L^2([0,1])} \int_{u \in [0,1]} \left\langle Q_{P_{\theta\#}\alpha}(u) - Q_{P_{\theta\#}\beta}(u), v(u) \right\rangle^2 du\, d\xi(v)\, dS^{d-1}(\theta) \right\}^{1/2}.$

For computation, the inner integrals are estimated using Monte Carlo samples $\theta_s \in S^{d-1}$ and Gaussian process paths $v_{s,t}$.
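
The following is a compact Monte Carlo sketch of this double-slicing recipe for empirical meta-measures, not the reference implementation; the quantile grid size, RBF bandwidth, and numbers of sampled directions and Gaussian process paths are illustrative assumptions.

```python
# Monte Carlo sketch of the Double-Sliced Wasserstein distance between two
# empirical meta-measures on P_2(R^d). Illustrative parameter choices only.
import numpy as np

def dsw(alpha, beta, S=50, T=50, m=128, bandwidth=0.1, rng=None):
    """alpha, beta: lists of (n_i, d) arrays, each a uniform empirical inner measure."""
    rng = rng or np.random.default_rng()
    d = alpha[0].shape[1]
    u = (np.arange(m) + 0.5) / m                   # quantile grid on [0, 1]

    # Gaussian process prior xi on L^2([0,1]) with an RBF kernel,
    # sampled pathwise on the grid via a (jittered) Cholesky factor.
    K = np.exp(-0.5 * (u[:, None] - u[None, :]) ** 2 / bandwidth ** 2)
    L = np.linalg.cholesky(K + 1e-6 * np.eye(m))

    total = 0.0
    for _ in range(S):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)             # Euclidean slice on S^{d-1}

        # Quantile functions of the projected inner measures: rows live in R^m.
        QA = np.stack([np.quantile(X @ theta, u) for X in alpha])
        QB = np.stack([np.quantile(Y @ theta, u) for Y in beta])

        V = L @ rng.normal(size=(m, T))            # T GP paths, columns v_t(u)
        # Inner slicing: <Q, v>_{L^2} sends each inner measure to a scalar,
        # giving 1D empirical measures with N (resp. M) atoms per path.
        PA = QA @ V / m                            # shape (N, T)
        PB = QB @ V / m                            # shape (M, T)

        for t in range(T):
            qa = np.quantile(PA[:, t], u)          # outer 1D W_2 via quantiles
            qb = np.quantile(PB[:, t], u)
            total += np.mean((qa - qb) ** 2)

    return np.sqrt(total / (S * T))

rng = np.random.default_rng(0)
alpha = [rng.normal(size=(60, 3)) for _ in range(10)]
beta = [rng.normal(loc=0.5, size=(60, 3)) for _ in range(12)]
print(f"DSW(alpha, beta) ~ {dsw(alpha, beta, rng=rng):.4f}")
```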

4. Topological Properties and Equivalence with WoW

Let empirical meta-measures $\alpha_n, \beta_n \in P^N(P^{\tilde n}(\mathbb{R}^d))$ be composed of $N$ inner empirical measures, each with $\tilde n$ support points. The DSW metric is topologically equivalent to the WoW metric:

$DSW(\alpha_n, \beta_n; \xi) \to 0 \quad\Longleftrightarrow\quad \mathrm{WoW}(\alpha_n, \beta_n) \to 0$

for any positive Gaussian $\xi$ (Piening et al., 26 Sep 2025). The argument combines a discrete Cramér–Wold theorem at each slice $\theta$ with the quantile-space isometry. This ensures that DSW is a true metric that preserves the geometry induced by WoW on the space of empirical meta-measures.

5. Computational Complexity and Numerical Stability

| Metric | Complexity per evaluation | Stability considerations |
|---|---|---|
| WoW | $\mathcal{O}(N^2 n^2\log n)$ | Requires all pairwise inner 2-Wasserstein computations; slow for large $N$ and $n$; sensitive to moment estimation |
| DSW | $\mathcal{O}(S N n\log n)$ | Only $S \ll N^2$ projections needed; relies on quantile functions (no high-order moments); numerically robust |
  • For full WoW on $N$ meta-points of size $n$, the cost is dominated by an $N\times N$ matrix of pairwise 2-Wasserstein computations, each in $O(n^2\log n)$ (entropic case).
  • For DSW, sampling $S$ directions and computing 1D quantile transports per meta-measure gives a total complexity of $O(S N n \log n)$, usually with $S = O(N)$ or constant, as illustrated after this list.
  • DSW avoids the unstable high-order moments used in s-OTDD, maintaining stability even with non-Gaussian or heavy-tailed meta-distributions.
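
A back-of-the-envelope calculation, using the stated asymptotic costs with illustrative sizes and ignoring constants and lower-order terms, shows the scale of the speedup:

```python
# Rough comparison of the stated costs, WoW ~ N^2 n^2 log n versus
# DSW ~ S N n log n, for illustrative sizes (no claim about real runtimes).
import math

N, n, S = 1_000, 2_000, 100  # outer atoms, inner points, sampled slices
wow_ops = N**2 * n**2 * math.log(n)
dsw_ops = S * N * n * math.log(n)
print(f"speedup factor ~ {wow_ops / dsw_ops:,.0f}")  # = N * n / S = 20,000
```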

6. Empirical Results and Applications

Experimental evaluations in (Piening et al., 26 Sep 2025) demonstrate that DSW achieves strong performance in several tasks:

  • Shape classification via local distance distributions (mm-spaces): DSW matches the accuracy of Gromov–Wasserstein and sliced GW (STLB) while running an order of magnitude faster.
  • Dataset similarity (OTDD surrogate): On MNIST, Fashion-MNIST, and CIFAR-10 splits, DSW correlates with exact OTDD (Pearson $> 0.9$), outperforming s-OTDD in stability and speed.
  • Point-cloud evaluation: For batches of 3D shapes modeled as meta-measures in $P(P(\mathbb{R}^3))$, DSW matches OT-NNA and WoW in sensitivity while being 10–20$\times$ faster, with similar robustness to mode collapse and sampling noise.
  • Image perceptual distance: Image batches are represented as meta-measures over patch distributions. DSW defines a perceptual metric sensitive to qualitative similarity, aligns with standard fiducial metrics (e.g., Kernel Inception Distance), and is 40$\times$ faster than full WoW.

These results indicate that DSW yields operationally efficient metrics for meta-measure comparison without the compromises of parametric forms or unstable statistical estimators.

7. Significance and Prospects

Double-Sliced Wasserstein provides a tractable, mathematically principled metric for meta-level optimal transport problems, preserving the topology and discriminative power of WoW while mitigating prohibitive computational demands. Its combination of classical slicing and functional quantile-space slicing leverages both geometry and statistical properties of optimal transport. The approach is widely applicable to large-scale shape analysis, dataset comparison, and the evaluation of structured or hierarchical data distributions.

Open questions include optimizing DSW kernel choices for task-adaptiveness, developing theoretical dual formulations for functional slicing, and extending the construction beyond empirical meta-measures to infinite or continuous families. A plausible implication is that DSW could serve as a foundation for scalable learning frameworks in high-level data spaces where conventional OT remains intractable, especially in large-dimensional and nonparametric distributional regimes (Piening et al., 26 Sep 2025).
