Stereographic Spherical Sliced Wasserstein Distances (2402.02345v2)

Published 4 Feb 2024 in cs.LG, cs.AI, cs.CV, and stat.ML

Abstract: Comparing spherical probability distributions is of great interest in various fields, including geology, medical domains, computer vision, and deep representation learning. The utility of optimal transport-based distances, such as the Wasserstein distance, for comparing probability measures has spurred active research in developing computationally efficient variations of these distances for spherical probability measures. This paper introduces a high-speed and highly parallelizable distance for comparing spherical measures using the stereographic projection and the generalized Radon transform, which we refer to as the Stereographic Spherical Sliced Wasserstein (S3W) distance. We carefully address the distance distortion caused by the stereographic projection and provide an extensive theoretical analysis of our proposed metric and its rotationally invariant variation. Finally, we evaluate the performance of the proposed metrics and compare them with recent baselines in terms of both speed and accuracy through a wide range of numerical studies, including gradient flows and self-supervised learning. Our code is available at https://github.com/mint-vu/s3wd.

Summary

  • The paper introduces novel S3W and RI-S3W distances that leverage stereographic projection and an injective map to compare spherical probability measures efficiently.
  • It integrates the Radon transform and its extensions to quantify Wasserstein distances on spheres, outperforming traditional methods in gradient flows and autoencoder frameworks.
  • Experimental evaluations demonstrate improved performance in self-supervised learning, generative modeling, and density estimation, offering faster computation and greater accuracy.

Stereographic Spherical Sliced Wasserstein Distances

The paper "Stereographic Spherical Sliced Wasserstein Distances" proposes novel distance metrics for comparing spherical probability measures via optimal transport methods. The new metrics are termed Stereographic Spherical Sliced Wasserstein (S3W) distance and its rotationally invariant variant (RI-S3W). The primary motivation is to improve computational efficiency and accuracy when comparing distributions defined on a hypersphere, which arise in various scientific and engineering applications, such as deep representation learning, computer vision, and medical imaging.

Methodological Contributions and Technical Highlights

Stereographic Projection and Radon Transform

The core idea is to use the stereographic projection to map points from the sphere $\mathbb{S}^d$ (minus the projection pole) onto the Euclidean space $\mathbb{R}^d$. This projection is conformal: it preserves local angles but not distances, and points near the pole are sent arbitrarily far from the origin. To address this distortion, the authors compose the stereographic projection with an injective map $h$ chosen so that distances in the embedded space better reflect distances on the sphere. The classic Radon transform and its variants, the Generalized Radon Transform (GRT) and an injective function-based GRT extension, are then applied to the projected measures.
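To make the distortion concrete, here is a minimal sketch of the stereographic projection from the north pole, together with one possible injective map. The function names are hypothetical, and `equidistant_map` is an illustrative assumption rather than the paper's $h$: it relies on the fact that a point at angle $\theta$ from the south pole projects to norm $\tan(\theta/2)$, so rescaling the norm to $2\arctan(\lVert x \rVert)$ recovers $\theta$.

```python
import math

def stereographic_projection(s):
    """Map a point s on the unit sphere S^d (d+1 floats) to R^d,
    projecting from the north pole (0, ..., 0, 1)."""
    *head, last = s
    if last >= 1.0:
        raise ValueError("the north pole has no finite image")
    return [c / (1.0 - last) for c in head]

def equidistant_map(x):
    """Illustrative injective map (an assumption, not the paper's h):
    rescale x radially so that its norm equals the geodesic distance
    from the south pole to the original point on the sphere."""
    r = math.sqrt(sum(c * c for c in x))
    if r == 0.0:
        return list(x)  # the south pole stays at the origin
    scale = 2.0 * math.atan(r) / r
    return [scale * c for c in x]

def norm(x):
    return math.sqrt(sum(c * c for c in x))

if __name__ == "__main__":
    # Points on S^2 at increasing angle theta from the south pole.
    for theta in (0.5, 1.5, 2.5, 3.0):
        s = [math.sin(theta), 0.0, -math.cos(theta)]
        x = stereographic_projection(s)
        print(theta, norm(x), norm(equidistant_map(x)))
```

The raw projected norm grows without bound as the point approaches the north pole, while the rescaled norm stays equal to the angle. This map only corrects distances measured radially from the south pole; distances between arbitrary pairs of points are still distorted to some degree.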

Stereographic Spherical Sliced Wasserstein Distance

The S3W distance is introduced through the following steps:

  1. Stereographic Projection (SP): Points on the sphere $\mathbb{S}^d$ are mapped to $\mathbb{R}^d$ using SP.
  2. Injective Map: An injective map $h$ is applied to manage the distortion from SP.
  3. Radon Transform: The Radon transform or its generalized variant, $\mathcal{G}$, is applied to slices taken from the transformed distribution.
  4. Sliced Wasserstein Metric: High-dimensional distributions are compared using the $p$-Wasserstein distance between their one-dimensional projections.
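The four steps can be sketched end to end for two equal-size point clouds. This is a rough illustration under stated assumptions, not the authors' implementation (their code is at the repository linked in the abstract): all function names are hypothetical, `h_map` is an illustrative radial rescaling rather than the paper's $h$, slicing uses plain linear projections (the classic Radon transform) instead of a generalized variant, and the integral over slices is replaced by a Monte Carlo average.

```python
import math
import random

def stereographic_projection(s):
    """Step 1: project a point of S^d (d+1 floats) from the north pole to R^d."""
    *head, last = s
    return [c / (1.0 - last) for c in head]

def h_map(x):
    """Step 2: illustrative injective map (an assumption): radial rescaling
    so that the norm equals the geodesic distance from the south pole."""
    r = math.sqrt(sum(c * c for c in x))
    if r == 0.0:
        return list(x)
    scale = 2.0 * math.atan(r) / r
    return [scale * c for c in x]

def random_direction(dim, rng):
    """Uniform random unit vector in R^dim (normalized Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def wasserstein_1d_pow(a, b, p):
    """W_p^p between two equal-size 1D empirical measures, via sorting."""
    return sum(abs(u - v) ** p for u, v in zip(sorted(a), sorted(b))) / len(a)

def s3w_sketch(xs, ys, n_slices=100, p=2, seed=0):
    """Monte Carlo estimate of a sliced Wasserstein distance between two
    point clouds on the sphere after projection and rescaling."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    rng = random.Random(seed)
    fx = [h_map(stereographic_projection(s)) for s in xs]
    fy = [h_map(stereographic_projection(s)) for s in ys]
    dim = len(fx[0])
    total = 0.0
    for _ in range(n_slices):
        theta = random_direction(dim, rng)
        # Step 3: linear slice, i.e. inner product with the direction theta.
        px = [sum(t * c for t, c in zip(theta, x)) for x in fx]
        py = [sum(t * c for t, c in zip(theta, y)) for y in fy]
        # Step 4: one-dimensional Wasserstein distance on the slice.
        total += wasserstein_1d_pow(px, py, p)
    return (total / n_slices) ** (1.0 / p)

def sample_sphere(n, ambient_dim, rng):
    """n uniform samples on the unit sphere in R^ambient_dim."""
    return [random_direction(ambient_dim, rng) for _ in range(n)]

if __name__ == "__main__":
    rng = random.Random(0)
    xs = sample_sphere(200, 3, rng)
    ys = sample_sphere(200, 3, rng)
    print(s3w_sketch(xs, ys))
```

Because each slice reduces to a sort, the cost per slice is O(n log n), and slices are independent of one another, which is where the parallelizability mentioned in the abstract comes from.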

Rotationally Invariant Spherical Sliced Wasserstein Distance

To enhance robustness, a rotationally invariant version of S3W (RI-S3W) is proposed. This involves averaging the distances over multiple random rotations from $\mathrm{SO}(d+1)$. This rotational averaging ensures that the distance metric remains invariant under spherical rotations, making it suitable for applications where orientation should not affect the comparison.
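The averaging step can be sketched independently of the base distance. In the snippet below, all names are hypothetical, `toy_distance` is a deliberately simple, orientation-dependent placeholder standing in for S3W, and rotations are drawn by Gram-Schmidt orthogonalization of Gaussian vectors with a sign fix to force determinant +1. With a finite number of sampled rotations, the result is only approximately invariant.

```python
import math
import random

def determinant(m):
    """Determinant by Laplace expansion (factorial cost; small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * determinant(minor)
    return total

def random_rotation(n, rng):
    """Random n x n rotation matrix as a list of orthonormal rows."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for u in rows:  # remove components along the rows found so far
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        length = math.sqrt(sum(a * a for a in v))
        if length > 1e-8:
            rows.append([a / length for a in v])
    if determinant(rows) < 0:  # reflection: flip one row to land in SO(n)
        rows[0] = [-a for a in rows[0]]
    return rows

def rotate(R, s):
    """Apply the matrix R to the point s."""
    return [sum(r * c for r, c in zip(row, s)) for row in R]

def rotation_averaged(base_distance, xs, ys, n_rotations=10, seed=0):
    """Average base_distance over random rotations applied to both point sets."""
    rng = random.Random(seed)
    n = len(xs[0])
    total = 0.0
    for _ in range(n_rotations):
        R = random_rotation(n, rng)
        total += base_distance([rotate(R, s) for s in xs],
                               [rotate(R, s) for s in ys])
    return total / n_rotations

def toy_distance(xs, ys):
    """Placeholder for S3W: gap between mean last coordinates."""
    mean_x = sum(s[-1] for s in xs) / len(xs)
    mean_y = sum(s[-1] for s in ys) / len(ys)
    return abs(mean_x - mean_y)

if __name__ == "__main__":
    xs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    ys = [[0.0, 0.0, -1.0], [0.0, -1.0, 0.0]]
    print(rotation_averaged(toy_distance, xs, ys, n_rotations=50))
```

The same rotation is applied to both inputs in each draw, so the comparison between the two measures is preserved while the dependence on the choice of projection pole is averaged out.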

Experimental Evaluation

Gradient Flows on Spheres

The paper demonstrates the utility of the proposed distances in gradient flow problems, where the goal is to match a complex target distribution composed of multiple von Mises-Fisher distributions. Results show that S3W and RI-S3W significantly outperform existing methods such as the Spherical Sliced Wasserstein (SSW) distance in terms of runtime and, in some cases, final performance.

Self-Supervised Learning (SSL)

S3W-based uniformity loss is evaluated in a self-supervised learning framework for representation learning on the CIFAR-10 dataset. The results indicate that models trained with S3W and its variants provide competitive performance while offering reduced computational costs.

Sliced-Wasserstein Autoencoders (SWAE)

The S3W metrics are further tested in the context of SWAEs for generative modeling. Experiments on MNIST and CIFAR-10 indicate that S3W-based SWAEs achieve comparable or better reconstruction quality and latent space regularity, while being computationally more efficient.

Density Estimation on the Sphere

The paper also explores density estimation using normalizing flows for spherical data like Earthquake and Fire datasets. The S3W metrics are shown to yield more accurate density estimates with faster training times compared to existing baselines.

Theoretical and Practical Implications

From a theoretical perspective, combining the stereographic projection with an injective map $h$ to manage distortion represents a notable advance. It offers a systematic way to make distances in the projected space, and hence the $p$-Wasserstein distance computed there, better approximate geodesic distances on the sphere. Practically, the high computational efficiency and parallelizability of the proposed distances make them particularly appealing for large-scale applications, including deep learning and computer vision, where spherical data representations are prevalent.

Speculatively, future developments might include extending these techniques to other Riemannian manifolds or exploring their applicability in more diverse domains like astrophysics or climate modeling where spherical data structures are common.

In conclusion, the paper makes practical contributions to optimal transport and spherical statistics. The proposed S3W and RI-S3W distances combine theoretical analysis with computational efficiency, which makes them applicable across scientific and engineering domains involving spherical data.
