
Statistical and Computational Guarantees of Kernel Max-Sliced Wasserstein Distances (2405.15441v3)

Published 24 May 2024 in stat.ML, cs.CC, and cs.LG

Abstract: Optimal transport has been very successful for various machine learning tasks; however, it is known to suffer from the curse of dimensionality. Hence, dimensionality reduction is desirable when it is applied to high-dimensional data with low-dimensional structures. The kernel max-sliced (KMS) Wasserstein distance is developed for this purpose by finding an optimal nonlinear mapping that reduces data to $1$ dimension before computing the Wasserstein distance. However, its theoretical properties have not yet been fully developed. In this paper, we provide sharp finite-sample guarantees, under milder technical assumptions than the state of the art, for the KMS $p$-Wasserstein distance between two empirical distributions with $n$ samples for general $p\in[1,\infty)$. Algorithm-wise, we show that computing the KMS $2$-Wasserstein distance is NP-hard, and we then propose a semidefinite relaxation (SDR) formulation (which can be solved efficiently in polynomial time) and provide a relaxation gap for the obtained solution. We provide numerical examples to demonstrate the good performance of our scheme for high-dimensional two-sample testing.
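To make the sliced construction in the abstract concrete: for two empirical distributions with $n$ samples each, the $1$-dimensional $p$-Wasserstein distance has a closed form via sorted samples, and a max-sliced distance maximizes this over slicing directions. The sketch below illustrates the *linear* max-sliced case with a naive random search over directions; it is only a toy stand-in, not the paper's KMS distance, which slices through a nonlinear RKHS mapping and is computed via the proposed semidefinite relaxation.

```python
import numpy as np

def wasserstein_1d(x, y, p=2):
    # Closed form for 1D empirical measures with equal sample sizes:
    # sort both samples and average the p-th powers of coordinate gaps.
    x, y = np.sort(x), np.sort(y)
    return (np.mean(np.abs(x - y) ** p)) ** (1.0 / p)

def max_sliced_wasserstein(X, Y, p=2, n_dirs=500, seed=0):
    # Illustrative random search over unit directions theta, taking the
    # largest 1D Wasserstein distance between the projected samples.
    # The paper's KMS distance instead optimizes over nonlinear (RKHS)
    # slices, which is NP-hard for p=2 and motivates the SDR formulation.
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    thetas = rng.normal(size=(n_dirs, d))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    return max(wasserstein_1d(X @ t, Y @ t, p) for t in thetas)
```

For example, if `Y` is `X` shifted by a fixed vector of length $2$, each projected pair differs by a constant, so the max-sliced value approaches $2$ as the search direction aligns with the shift.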

