
Comparing Multivariate Distributions: A Novel Approach Using Optimal Transport-based Plots

Published 30 Apr 2024 in stat.ME, math.ST, and stat.TH | (2404.19700v1)

Abstract: Quantile-Quantile (Q-Q) plots are widely used for assessing the distributional similarity between two datasets. Traditionally, Q-Q plots are constructed for univariate distributions, making them less effective in capturing complex dependencies present in multivariate data. In this paper, we propose a novel approach for constructing multivariate Q-Q plots, which extend the traditional Q-Q plot methodology to handle high-dimensional data. Our approach utilizes optimal transport (OT) and entropy-regularized optimal transport (EOT) to align the empirical quantiles of the two datasets. Additionally, we introduce another technique based on OT and EOT potentials which can effectively compare two multivariate datasets. Through extensive simulations and real data examples, we demonstrate the effectiveness of our proposed approach in capturing multivariate dependencies and identifying distributional differences such as tail behaviour. We also propose two test statistics based on the Q-Q and potential plots to compare two distributions rigorously.
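The alignment idea in the abstract can be illustrated with a small sketch. It is not the authors' construction: it assumes two samples of equal size with uniform weights, in which case discrete OT under squared Euclidean cost reduces to an assignment problem that `scipy.optimize.linear_sum_assignment` solves exactly. The function name `ot_match` and the demo data are hypothetical. Plotting each coordinate of the matched pairs against each other gives one Q-Q-style panel per dimension; an entropy-regularized variant would replace the assignment step with a Sinkhorn solver.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def ot_match(x, y):
    """Pair rows of x with rows of y via discrete OT (hypothetical helper).

    Assumes equal sample sizes and uniform weights, so the OT plan
    under squared Euclidean cost is a permutation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("this sketch assumes samples of identical shape (n, d)")
    # Pairwise squared Euclidean transport costs, shape (n, n).
    cost = cdist(x, y, metric="sqeuclidean")
    # Exact minimum-cost one-to-one matching.
    rows, cols = linear_sum_assignment(cost)
    return x[rows], y[cols]


# Illustrative data (assumption): a Gaussian sample against a
# heavier-tailed Student-t sample in two dimensions.
rng = np.random.default_rng(0)
sample_a = rng.normal(size=(200, 2))
sample_b = rng.standard_t(df=3, size=(200, 2))

matched_a, matched_b = ot_match(sample_a, sample_b)

# Each column pair (matched_a[:, k], matched_b[:, k]) can be scattered
# as one panel; points far from the diagonal flag a mismatch.
mean_sq_displacement = np.mean(np.sum((matched_a - matched_b) ** 2, axis=1))
```

In one dimension this matching coincides with pairing sorted values, which is the classical Q-Q plot. The summary `mean_sq_displacement` is only an illustrative diagnostic, not one of the test statistics proposed in the paper.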

