
Stream-SW: Streaming Sliced Wasserstein

Updated 14 November 2025
  • Streaming Sliced Wasserstein (Stream-SW) is a method that leverages quantile sketches and randomized projections to compute the sliced Wasserstein distance on streaming data.
  • It reduces high-dimensional optimal transport problems to a series of one-dimensional quantile queries, enabling single-pass processing with fixed memory.
  • The framework offers theoretical error bounds, improved convergence, and empirical advantages over random subsampling in diverse applications.

Streaming Sliced Wasserstein (Stream-SW) is a computational framework for estimating the sliced Wasserstein (SW) distance between probability distributions when samples arrive in a streaming fashion. It builds on quantile sketching techniques for 1D Wasserstein computation and extends them via randomized projections to provide a memory-efficient, single-pass algorithm for high-dimensional optimal transport problems. Stream-SW offers theoretical guarantees on accuracy and resource consumption, and demonstrates marked advantages over random subsampling approaches in a variety of empirical settings.

1. Streaming 1D-Wasserstein Distance via Quantile Sketches

Sliced Wasserstein methods reduce a $d$-dimensional optimal transport problem to a collection of one-dimensional projection problems. The 1D $p$-Wasserstein distance between empirical measures

$$\mu_n = \frac{1}{n}\sum_{i=1}^n \delta_{x_i}, \quad \nu_m = \frac{1}{m}\sum_{j=1}^m \delta_{y_j}$$

admits the closed form

$$W_p^p(\mu_n, \nu_m) = \int_0^1 |F^{-1}_{\mu_n}(q) - F^{-1}_{\nu_m}(q)|^p\, dq,$$

where $F^{-1}_{\mu_n}$ denotes the quantile function of $\mu_n$.
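Because both empirical quantile functions are step functions, the integral above can be evaluated exactly by walking through their breakpoints. The following is a minimal sketch of that computation for samples of possibly different sizes; the function name `wasserstein_1d_pp` is chosen here for illustration and does not come from the source.

```python
def wasserstein_1d_pp(xs, ys, p=1):
    """Exact W_p^p between two 1D empirical measures with uniform weights."""
    xs, ys = sorted(xs), sorted(ys)
    n, m = len(xs), len(ys)
    i = j = 0
    pos = 0        # current quantile level, scaled by n * m to stay in integers
    total = 0.0
    while i < n and j < m:
        # Next breakpoint of each quantile function, in units of 1 / (n * m).
        a, b = (i + 1) * m, (j + 1) * n
        nxt = min(a, b)
        # Both quantile functions are constant on (pos, nxt].
        total += (nxt - pos) * abs(xs[i] - ys[j]) ** p
        pos = nxt
        if a == nxt:
            i += 1
        if b == nxt:
            j += 1
    return total / (n * m)


print(wasserstein_1d_pp([0, 1], [0.5]))  # 0.5
```

This exact version needs all samples in memory and sorted, which is the requirement the streaming setting cannot meet.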

In a streaming context, storage of all samples is not feasible. Instead, Stream-SW maintains a quantile sketch $S_{\mu_n,k}$ of fixed size $k$, supporting approximate quantile queries $Q(q; S)$ such that $|Q(q; S)-F^{-1}_{\mu_n}(q)| \le \epsilon n C$, with $C$ the maximal sample gap (the largest gap between consecutive sorted samples). The KLL sketch of Karnin–Lang–Liberty supports one-pass updates with $O((1/\epsilon)\sqrt{\log(1/\delta)} + \log(n/k))$ memory, where $\delta$ is the failure probability.

Given two such sketches, the streaming 1D-Wasserstein estimator is

$$\widetilde W_p^p(\mu_n, \nu_m; S_{\mu_n,k_1}, S_{\nu_m,k_2}) = \int_0^1 |Q(q; S_{\mu_n, k_1}) - Q(q; S_{\nu_m, k_2})|^p\, dq.$$

For $p=1$, this reduces to an integral over the absolute difference of quantile queries. Sketches are incrementally updated in $O(\log(n/k))$ amortized time per sample.
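To make the sketch-then-query pattern concrete, here is a simplified compactor-based quantile sketch in the spirit of KLL, followed by the 1D estimator. It is an illustration under stated assumptions, not the KLL sketch itself: every level uses the same capacity `k` (KLL shrinks capacities geometrically), so memory grows as $O(k\log(n/k))$, and the integral over $q$ is approximated by a midpoint rule on a grid. The names `CompactorSketch`, `streaming_w1d_pp`, and `grid` are hypothetical.

```python
import random


class CompactorSketch:
    """Simplified KLL-style sketch; level h stores items of weight 2**h."""

    def __init__(self, k=200, seed=0):
        if k < 2 or k % 2:
            raise ValueError("k must be an even integer >= 2")
        self.k = k
        self.levels = [[]]
        self.rng = random.Random(seed)

    def update(self, x):
        """Insert one value; compact any level that reaches capacity k."""
        self.levels[0].append(x)
        h = 0
        while len(self.levels[h]) >= self.k:
            if h + 1 == len(self.levels):
                self.levels.append([])
            buf = sorted(self.levels[h])
            start = self.rng.randrange(2)  # keep even or odd positions at random
            self.levels[h + 1].extend(buf[start::2])  # survivors double in weight
            self.levels[h] = []
            h += 1

    def quantiles(self, qs):
        """Approximate quantiles for a nondecreasing list of levels qs in [0, 1]."""
        items = sorted((x, 2 ** h) for h, lvl in enumerate(self.levels) for x in lvl)
        if not items:
            raise ValueError("empty sketch")
        total = sum(w for _, w in items)
        out, acc, idx = [], 0, -1
        for q in qs:
            # Advance until the accumulated weight reaches the target rank.
            while acc < q * total and idx + 1 < len(items):
                idx += 1
                acc += items[idx][1]
            out.append(items[max(idx, 0)][0])
        return out


def streaming_w1d_pp(sk_mu, sk_nu, p=1, grid=1000):
    """Midpoint-rule approximation of the integral of |Q_mu(q) - Q_nu(q)|**p."""
    qs = [(t + 0.5) / grid for t in range(grid)]
    qa, qb = sk_mu.quantiles(qs), sk_nu.quantiles(qs)
    return sum(abs(a - b) ** p for a, b in zip(qa, qb)) / grid
```

Each compaction at level $h$ perturbs the estimated rank of any query point by at most $2^h$, which is where the rank error of such a sketch originates; a production implementation would rely on a tested KLL library rather than this toy version.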

2. Streaming Sliced Wasserstein Algorithm

The sliced Wasserstein distance of order $p$ for $d$-dimensional measures $\mu$ and $\nu$ is

$$SW_p^p(\mu, \nu) = \mathbb{E}_{\theta \sim \mathcal{U}(\mathbb{S}^{d-1})} [W_p^p(\theta \sharp \mu,\, \theta \sharp \nu)],$$

where $\mathbb{S}^{d-1}$ is the unit sphere and $\theta \sharp \mu$ is the projection of $\mu$ onto direction $\theta$.

Streaming Sliced Wasserstein (Stream-SW) replaces each $W_p$ with $\widetilde W_p$ from quantile sketches. The algorithm proceeds as follows:

  • Select $L$ projection directions $\theta_1, \ldots, \theta_L \in \mathbb{S}^{d-1}$.
  • For each direction $\ell=1,\ldots,L$ and each incoming $x_i$ from $\mu$'s stream, compute $\theta_\ell^\top x_i$ and update $S_{\theta_\ell \sharp \mu, k_1}$. Proceed analogously for each incoming $y_j$ from $\nu$'s stream, updating $S_{\theta_\ell \sharp \nu, k_2}$.
  • At any point, estimate the SW distance by

$$\widehat{\widetilde{SW}_p^p}(\mu_n, \nu_m; k_1, k_2, L) = \frac{1}{L} \sum_{\ell=1}^L \int_0^1 |Q(q; S_{\theta_\ell \sharp \mu, k_1}) - Q(q; S_{\theta_\ell \sharp \nu, k_2})|^p\, dq.$$

Stream-SW thereby enables a single-pass, memory-bounded estimate of the SW distance at any time.
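The three steps above can be sketched as a small class that keeps one summary per direction and per stream. This is an illustrative outline rather than the reference implementation: the class names, the `sketch_factory` parameter, and the midpoint `grid` are assumptions, and `ExactStore` is a deliberately naive stand-in that keeps every projected value so the example runs on its own. Any fixed-size quantile sketch exposing the same `update` and `quantiles` methods would take its place to obtain bounded memory.

```python
import bisect
import math
import random


class ExactStore:
    """Stand-in for a quantile sketch: stores every value (NOT memory-bounded)."""

    def __init__(self):
        self.vals = []

    def update(self, x):
        bisect.insort(self.vals, x)

    def quantiles(self, qs):
        n = len(self.vals)
        # Empirical quantile: smallest stored value whose CDF is at least q.
        return [self.vals[min(n - 1, max(0, math.ceil(q * n) - 1))] for q in qs]


class StreamSW:
    def __init__(self, dim, num_proj=50, p=2, grid=200,
                 sketch_factory=ExactStore, seed=0):
        rng = random.Random(seed)
        self.thetas = []
        for _ in range(num_proj):
            # A normalized Gaussian vector is uniform on the unit sphere.
            g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            norm = math.sqrt(sum(v * v for v in g))
            self.thetas.append([v / norm for v in g])
        self.p = p
        self.qs = [(t + 0.5) / grid for t in range(grid)]
        self.sk_mu = [sketch_factory() for _ in range(num_proj)]
        self.sk_nu = [sketch_factory() for _ in range(num_proj)]

    def _push(self, sketches, x):
        # Project the new sample on every direction and update that summary.
        for theta, sk in zip(self.thetas, sketches):
            sk.update(sum(t * v for t, v in zip(theta, x)))

    def update_mu(self, x):
        self._push(self.sk_mu, x)

    def update_nu(self, y):
        self._push(self.sk_nu, y)

    def estimate(self):
        """Average over directions of the midpoint-rule 1D estimate of W_p^p."""
        total = 0.0
        for a, b in zip(self.sk_mu, self.sk_nu):
            qa, qb = a.quantiles(self.qs), b.quantiles(self.qs)
            total += sum(abs(u - v) ** self.p for u, v in zip(qa, qb)) / len(self.qs)
        return total / len(self.thetas)
```

Calling `update_mu` and `update_nu` as samples arrive and `estimate()` whenever a value is needed mirrors the anytime property described above; the two streams need not advance in lockstep.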

3. Theoretical Guarantees and Complexity

Stream-SW provides explicit, nonasymptotic bounds on both memory usage and approximation error:

  • Streaming 1D Wasserstein error: If the supports have diameter $R$ and the sketches have precisions $\epsilon_1,\epsilon_2$, the error satisfies

$$|\widetilde W_p^p(\mu_n, \nu_m) - W_p^p(\mu_n, \nu_m)| \le p R^{p-1} (\epsilon_1 n C_1 + \epsilon_2 m C_2).$$

  • SW population-level error: For i.i.d. samples from $\mu, \nu$, the expected error in SW is

$$\mathbb{E} |\widetilde{SW}_p^p(\mu_n, \nu_m) - SW_p^p(\mu, \nu)| \le C_{p,R} \left(\alpha(d, n, m) + \epsilon_1 n + \epsilon_2 m\right)$$

with $\alpha(d,n,m) = \sqrt{d+1}\left(\sqrt{\frac{\log n}{n}} + \sqrt{\frac{\log m}{m}}\right)$.

  • Monte Carlo error in $L$ projections:

$$\mathbb{E}|\widehat{\widetilde{SW}_p^p} - \widetilde{SW}_p^p| \le \frac{\operatorname{Var}^{1/2}[\,\widetilde{W}_p^p(\theta \sharp \mu_n, \theta \sharp \nu_m)\,]}{\sqrt L}.$$

  • Memory and computational complexity: Each KLL sketch uses $k=O((1/\epsilon)\sqrt{\log(1/\delta)})$ space. With $2L$ sketches and direction storage, total space is $O(Lk + L\log(n/k) + Ld)$; the per-sample update cost is $O(L\log(n/k) + Ld)$.

Stream-SW thus achieves $O(n^{-1/2} + k^{-1} + L^{-1/2})$ error rates, with memory and time costs controlled by $L$ and $k$.

4. Proof Outline of Approximation Bounds

The approximation guarantees derive from four elements:

  • Quantile approximation: The sketch replaces the exact quantile function; a Taylor or Hölder expansion delivers a pointwise error proportional to the sketch precision and window width ($O(\epsilon_1 n C_1 + \epsilon_2 m C_2)$).
  • Projection averaging: The error is averaged over random $\theta$, yielding the same order bound for the SW aggregate.
  • Sampling error decomposition: The estimator's discrepancy with population SW separates into (i) sketching error and (ii) statistical sampling error, with the latter analyzed via empirical process (VC) bounds over half-spaces $\{x : \theta^\top x \leq z\}$.
  • Monte Carlo in $L$: The standard MC variance bound applies to the projection average, yielding $O(1/\sqrt L)$ behavior.
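
As a worked illustration of the first element, one standard route to a bound of the form $pR^{p-1}(\epsilon_1 n C_1 + \epsilon_2 m C_2)$ is sketched below. It assumes that sketch queries return values inside the supports, so that both differences are at most $R$; the source's own proof may proceed differently. For $a, b \in [0, R]$ and $p \ge 1$, the mean value theorem gives

$$|a^p - b^p| \le p R^{p-1} |a - b|.$$

Taking $a(q) = |Q(q; S_{\mu_n,k_1}) - Q(q; S_{\nu_m,k_2})|$ and $b(q) = |F^{-1}_{\mu_n}(q) - F^{-1}_{\nu_m}(q)|$, the reverse triangle inequality and the per-sketch guarantee yield

$$|a(q) - b(q)| \le |Q(q; S_{\mu_n,k_1}) - F^{-1}_{\mu_n}(q)| + |Q(q; S_{\nu_m,k_2}) - F^{-1}_{\nu_m}(q)| \le \epsilon_1 n C_1 + \epsilon_2 m C_2.$$

Integrating over $q \in [0,1]$ then gives

$$|\widetilde W_p^p(\mu_n, \nu_m) - W_p^p(\mu_n, \nu_m)| \le \int_0^1 |a(q)^p - b(q)^p|\, dq \le p R^{p-1} (\epsilon_1 n C_1 + \epsilon_2 m C_2).$$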

5. Empirical Evaluation

Stream-SW demonstrates favorable empirical properties across several domains:

| Task | Key finding | Setup / comparison |
|---|---|---|
| Mixtures of Gaussians | 2×–10× lower error than subsample-SW at same $k$ | 10×–100× fewer points retained |
| Point-cloud classification | Stream-SW ($k=20$) achieves 76.3–77.7% vs full SW 77.3–77.7% accuracy; subsample-SW ($k=20$): 67.7–68.0% | ModelNet10, KNN ($K=5$) |
| Gradient flows | Faster convergence: $W_2^2\approx 26.9$ vs subsample-SW 31.1 at step 1000; only Stream-SW converges in 5000 steps | Euler–Maruyama, 1000 particles |
| Change-point detection | Detection delay reduced to 10–32 frames (SW sliding window: 49–100) | MSRC-12 Kinect |

Stream-SW thus attains higher accuracy or faster convergence at fixed memory compared to uniform random subsampling approaches, particularly under severe memory bottlenecks.

6. Implementation, Tuning, and Extensions

Key practical considerations when deploying Stream-SW include:

  • Choosing the number of projections $L$ and sketch size $k$: Increasing $L$ reduces MC error as $O(1/\sqrt L)$, whereas increasing $k$ strengthens quantile accuracy as $O(1/k)$. Total memory and per-sample update cost scale linearly with $L$ and logarithmically with $n/k$.
  • Projection schemes: Standard MC directions can be replaced by quasi-MC sequences (e.g., Sobol, Halton) or optimized directions for sphere integration, as in quasi-Monte Carlo for 3D SW.
  • Handling asymmetric streams: In scenarios where only one distribution is streaming, maintain the sketch only for the streaming distribution and compare on-the-fly to the fixed other.
  • Extensions: The approach adapts to generalized sliced OT, spherical or manifold-projected SW, and partial-SW, by substituting the appropriate 1D streaming OT solver.
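
For the choice of $L$, the Monte Carlo bound suggests a simple plug-in heuristic: estimate the variance of the per-direction values from a small pilot set of directions, then pick the smallest $L$ for which the standard deviation divided by $\sqrt L$ falls below a target tolerance. The helper below is a hypothetical sketch of that rule; `projections_needed`, `pilot_values`, and `tol` are names introduced here, and the pilot variance is only an estimate of the true one.

```python
import math
import statistics


def projections_needed(pilot_values, tol):
    """Smallest L with sd / sqrt(L) <= tol, using the pilot sample variance.

    pilot_values: per-direction estimates of W_p^p from a few pilot directions.
    tol: target Monte Carlo error on the averaged estimate.
    """
    var = statistics.variance(pilot_values)  # unbiased sample variance
    return max(1, math.ceil(var / tol ** 2))


# Hypothetical pilot values from four directions.
print(projections_needed([1.0, 2.0, 3.0, 4.0], tol=0.1))  # 167
```

Since memory and update cost grow linearly with $L$, the tolerance has to be balanced against the available budget for the $2L$ sketches.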

In sum, Stream-SW is the first single-pass, low-memory methodology for sliced Wasserstein distance estimation from sample streams, with rigorous finite-sample and memory–error guarantees, and empirically outperforms random subsampling algorithms under tight resource constraints (Nguyen, 11 May 2025).
