- The paper introduces novel S3W and RI-S3W distances that leverage stereographic projection and an injective map to compare spherical probability measures efficiently.
- It uses the Radon transform and its generalizations to define sliced Wasserstein distances on the sphere, and reports gains over existing methods in gradient flows and autoencoder frameworks.
- Experimental evaluations demonstrate improved performance in self-supervised learning, generative modeling, and density estimation, offering faster computation and greater accuracy.
Stereographic Spherical Sliced Wasserstein Distances
The paper "Stereographic Spherical Sliced Wasserstein Distances" proposes novel distance metrics for comparing spherical probability measures via optimal transport methods. The new metrics are termed Stereographic Spherical Sliced Wasserstein (S3W) distance and its rotationally invariant variant (RI-S3W). The primary motivation is to improve computational efficiency and accuracy when comparing distributions defined on a hypersphere, which arise in various scientific and engineering applications, such as deep representation learning, computer vision, and medical imaging.
Methodological Contributions and Technical Highlights
Stereographic Projection and Radon Transform
The core idea is to use the stereographic projection to map points from the sphere (minus the projection pole) onto Euclidean space. This projection is conformal: it preserves local angles but not distances, and the distortion grows for points near the pole. To address this, the authors compose the stereographic projection with an injective map h chosen to reduce distance distortion in the embedded space. The classic Radon transform and its variants, the Generalized Radon Transform (GRT) and a GRT extension built on an injective function, are then applied to the projected measures.
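The distortion described above can be illustrated with a minimal NumPy sketch. The function names, the choice of the north pole as projection pole, and the specific map h(x) = 2 arctan(||x||) x/||x|| are assumptions made for illustration; they are not taken from the paper, whose exact definition of h may differ.

```python
import numpy as np

def stereographic_projection(points, eps=1e-12):
    """Map rows of `points` (unit vectors in R^{d+1}) to R^d, projecting from the north pole."""
    points = np.asarray(points, dtype=float)
    # The north pole itself has no finite image; eps only guards the division.
    return points[:, :-1] / np.maximum(1.0 - points[:, -1:], eps)

def equidistant_map(x, eps=1e-12):
    """Hypothetical choice of h: rescale x radially so that ||h(x)|| = 2 * arctan(||x||)."""
    norm = np.linalg.norm(x, axis=1, keepdims=True)
    return 2.0 * np.arctan(norm) * x / np.maximum(norm, eps)

def point_at_angle(alpha):
    """Point on S^2 at geodesic distance `alpha` from the south pole (0, 0, -1)."""
    return np.array([[np.sin(alpha), 0.0, -np.cos(alpha)]])

# Compare the geodesic distance from the south pole with the planar distance
# from the origin, before and after applying the illustrative map h.
angles = np.array([0.5, 2.0, 3.0])
sp_radius = np.array(
    [np.linalg.norm(stereographic_projection(point_at_angle(a))) for a in angles]
)
h_radius = np.array(
    [np.linalg.norm(equidistant_map(stereographic_projection(point_at_angle(a)))) for a in angles]
)

for a, r_sp, r_h in zip(angles, sp_radius, h_radius):
    print(f"geodesic={a:.2f}  after SP={r_sp:.3f}  after h(SP)={r_h:.3f}")
```

With this pole convention, a point at geodesic distance alpha from the south pole lands at radius tan(alpha/2) after the projection alone, which blows up as alpha approaches pi. The illustrative h restores the radius to alpha, so distances measured from the south pole are preserved exactly. Distances between arbitrary pairs of points are still distorted, only less severely.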
Stereographic Spherical Sliced Wasserstein Distance
The S3W distance is introduced through the following steps:
- Stereographic Projection (SP): Points on the sphere S^d are mapped to R^d using SP.
- Injective Map: An injective map h is applied to manage the distortion from SP.
- Radon Transform: The Radon transform or its generalized variant, G, is applied to the transformed distribution to obtain its one-dimensional slices.
- Sliced Wasserstein Metric: The two distributions are compared using the p-Wasserstein distance between their one-dimensional slices, averaged over slicing directions.
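The four steps above can be sketched for empirical samples as follows. This is a minimal Monte Carlo illustration under stated assumptions, not the authors' implementation: the map h is a hypothetical choice, only the classic (linear) Radon transform is used, both samples are assumed to have the same size, and `s3w_sketch` and its parameters are names introduced here. For an empirical measure, a linear Radon slice in direction theta amounts to taking the inner product of every sample with theta, and the one-dimensional p-Wasserstein distance between equal-size samples is obtained by matching sorted values.

```python
import numpy as np

def stereographic_projection(points, eps=1e-12):
    """Map rows of `points` (unit vectors in R^{d+1}) to R^d, projecting from the north pole."""
    return points[:, :-1] / np.maximum(1.0 - points[:, -1:], eps)

def equidistant_map(x, eps=1e-12):
    """Hypothetical injective map h: rescale x radially to norm 2 * arctan(||x||)."""
    norm = np.linalg.norm(x, axis=1, keepdims=True)
    return 2.0 * np.arctan(norm) * x / np.maximum(norm, eps)

def s3w_sketch(x, y, n_slices=256, p=2, rng=None):
    """Monte Carlo sliced p-Wasserstein distance between two equal-size samples on S^d."""
    rng = np.random.default_rng(rng)
    # Steps 1 and 2: stereographic projection followed by the injective map.
    u = equidistant_map(stereographic_projection(x))
    v = equidistant_map(stereographic_projection(y))
    # Step 3: random unit slicing directions in R^d; projecting the samples onto
    # each direction gives the linear Radon slices of the empirical measures.
    theta = rng.normal(size=(u.shape[1], n_slices))
    theta /= np.linalg.norm(theta, axis=0, keepdims=True)
    # Step 4: in one dimension the optimal coupling matches sorted values.
    pu = np.sort(u @ theta, axis=0)
    pv = np.sort(v @ theta, axis=0)
    return float(np.mean(np.abs(pu - pv) ** p) ** (1.0 / p))

sample_rng = np.random.default_rng(0)
# Normalized Gaussian vectors are uniformly distributed on the sphere.
x = sample_rng.normal(size=(500, 3))
x /= np.linalg.norm(x, axis=1, keepdims=True)
# A shifted Gaussian, once normalized, concentrates around (0, 1, 0).
y = sample_rng.normal(size=(500, 3)) + np.array([0.0, 3.0, 0.0])
y /= np.linalg.norm(y, axis=1, keepdims=True)

d_xx = s3w_sketch(x, x, rng=1)
d_xy = s3w_sketch(x, y, rng=1)
d_yx = s3w_sketch(y, x, rng=1)
print(d_xx, d_xy, d_yx)
```

Everything after the projection is sorting and matrix multiplication, which is why this kind of computation parallelizes well across slices.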
Rotationally Invariant Stereographic Spherical Sliced Wasserstein Distance
Because the stereographic projection depends on the choice of projection pole, S3W is not invariant to rotations of the sphere. To address this, a rotationally invariant version (RI-S3W) is proposed, which averages the distance over random rotations drawn from SO(d+1), applied to both measures. The resulting metric is invariant under spherical rotations, making it suitable for applications where orientation should not affect the comparison.
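The rotational averaging can be sketched as below. This is again a hedged illustration rather than the paper's code: rotations are drawn by QR decomposition of a Gaussian matrix, the embedding uses a hypothetical map h, and the average is taken over p-th powers before the root, which is one reasonable reading of "averaging" (a plain mean of distances would be a simple alternative). All function names are introduced here.

```python
import numpy as np

def embed(points, eps=1e-12):
    """Stereographic projection from the north pole, then an illustrative injective map h."""
    x = points[:, :-1] / np.maximum(1.0 - points[:, -1:], eps)
    norm = np.linalg.norm(x, axis=1, keepdims=True)
    return 2.0 * np.arctan(norm) * x / np.maximum(norm, eps)

def s3w_pow(x, y, theta, p):
    """p-th power of the sliced distance for fixed slicing directions `theta`."""
    pu = np.sort(embed(x) @ theta, axis=0)
    pv = np.sort(embed(y) @ theta, axis=0)
    return np.mean(np.abs(pu - pv) ** p)

def random_rotation(dim, rng):
    """Draw a rotation from SO(dim) via QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(dim, dim)))
    q = q * np.sign(np.diag(r))   # sign fix so that q is uniform on the orthogonal group
    if np.linalg.det(q) < 0:      # flip one column to obtain determinant +1
        q[:, 0] = -q[:, 0]
    return q

def ri_s3w_sketch(x, y, n_rotations=16, n_slices=128, p=2, seed=None):
    """Average the sliced distance over random rotations applied to both samples."""
    rng = np.random.default_rng(seed)
    dim = x.shape[1]
    total = 0.0
    for _ in range(n_rotations):
        rot = random_rotation(dim, rng)
        theta = rng.normal(size=(dim - 1, n_slices))
        theta /= np.linalg.norm(theta, axis=0, keepdims=True)
        # The same rotation is applied to both samples before comparing them.
        total += s3w_pow(x @ rot.T, y @ rot.T, theta, p)
    return float((total / n_rotations) ** (1.0 / p))

sample_rng = np.random.default_rng(0)
x = sample_rng.normal(size=(300, 3))
x /= np.linalg.norm(x, axis=1, keepdims=True)
y = sample_rng.normal(size=(300, 3)) + np.array([0.0, 0.0, 3.0])
y /= np.linalg.norm(y, axis=1, keepdims=True)

rot_check = random_rotation(4, np.random.default_rng(0))
d_same = ri_s3w_sketch(x, x, seed=0)
d_diff = ri_s3w_sketch(x, y, seed=0)
print(d_same, d_diff)
```

With a finite number of sampled rotations the estimate is only approximately rotation invariant; the approximation improves as `n_rotations` grows, at a proportional cost in runtime.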
Experimental Evaluation
Gradient Flows on Spheres
The paper demonstrates the utility of the proposed distances in gradient flow problems, where the goal is to match a complex target distribution composed of multiple von Mises-Fisher distributions. Results show that S3W and RI-S3W significantly outperform existing methods such as the Spherical Sliced Wasserstein (SSW) distance in runtime and, in some cases, in final performance.
Self-Supervised Learning (SSL)
S3W-based uniformity loss is evaluated in a self-supervised learning framework for representation learning on the CIFAR-10 dataset. The results indicate that models trained with S3W and its variants provide competitive performance while offering reduced computational costs.
Sliced-Wasserstein Autoencoders (SWAE)
The S3W metrics are further tested in the context of SWAEs for generative modeling. Experiments on MNIST and CIFAR-10 indicate that S3W-based SWAEs achieve comparable or better reconstruction quality and latent space regularity, while being computationally more efficient.
Density Estimation on the Sphere
The paper also explores density estimation using normalizing flows for spherical data, such as the Earthquake and Fire datasets. The S3W metrics are shown to yield more accurate density estimates with faster training times than existing baselines.
Theoretical and Practical Implications
From a theoretical perspective, combining the stereographic projection with an injective map h to manage distortion is the main conceptual contribution. It offers a systematic way to make distances in the projected space approximate geodesic distances on the sphere, so that the sliced p-Wasserstein distance computed after projection reflects the spherical geometry. Practically, the computational efficiency and parallelizability of the proposed distances make them appealing for large-scale applications, including deep learning and computer vision, where spherical data representations are prevalent.
Speculatively, future developments might include extending these techniques to other Riemannian manifolds or exploring their applicability in more diverse domains like astrophysics or climate modeling where spherical data structures are common.
In conclusion, the paper makes practical contributions to optimal transport and spherical statistics. The proposed S3W and RI-S3W distances combine theoretical grounding with computational efficiency, and appear broadly applicable across scientific and engineering domains involving spherical data.