Lower Complexity Adaptation for Empirical Entropic Optimal Transport (2306.13580v4)
Abstract: Entropic optimal transport (EOT) presents an effective and computationally viable alternative to unregularized optimal transport (OT), offering diverse applications for large-scale data analysis. In this work, we derive novel statistical bounds for empirical plug-in estimators of the EOT cost and show that their statistical performance in the entropy regularization parameter $\epsilon$ and the sample size $n$ only depends on the simpler of the two probability measures. For instance, under sufficiently smooth costs this yields the parametric rate $n^{-1/2}$ with factor $\epsilon^{-d/2}$, where $d$ is the minimum dimension of the two population measures. This confirms that empirical EOT also adheres to the lower complexity adaptation principle, a hallmark feature only recently identified for unregularized OT. As a consequence of our theory, we show that the empirical entropic Gromov-Wasserstein distance and its unregularized version for measures on Euclidean spaces also obey this principle. Additionally, we comment on computational aspects and complement our findings with Monte Carlo simulations. Our techniques employ empirical process theory and rely on a dual formulation of EOT over a single function class. Crucial to our analysis is the observation that the entropic cost-transformation of a function class does not increase its uniform metric entropy by much.
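To make the notion of an empirical plug-in estimator concrete, the following is a minimal sketch that evaluates the EOT cost between two uniform empirical measures using log-domain Sinkhorn iterations on the dual potentials. It is an illustration under stated assumptions and not the authors' implementation: the squared Euclidean cost, the fixed iteration count, and the names `sq_dist`, `log_mean_exp` and `empirical_eot_cost` are all hypothetical choices made here. The regularization is taken as $\epsilon$ times the relative entropy of the coupling with respect to the product of the marginals, and the returned value is the dual objective, which lower-bounds the empirical EOT cost and approaches it as the iterations converge.

```python
import math
import random


def sq_dist(x, y):
    """Squared Euclidean cost c(x, y) = ||x - y||^2 (an assumed cost)."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))


def empirical_eot_cost(xs, ys, eps, iters=300):
    """Dual value of EOT between the uniform empirical measures on xs and ys.

    Alternates the two Sinkhorn updates of the dual potentials f and g.
    After each update of g the exponential penalty term of the dual
    equals zero, so the dual objective reduces to mean(f) + mean(g).
    """
    n, m = len(xs), len(ys)
    cost = [[sq_dist(x, y) for y in ys] for x in xs]
    f = [0.0] * n
    g = [0.0] * m
    for _ in range(iters):
        f = [
            -eps * log_mean_exp([(g[j] - cost[i][j]) / eps for j in range(m)])
            for i in range(n)
        ]
        g = [
            -eps * log_mean_exp([(f[i] - cost[i][j]) / eps for i in range(n)])
            for j in range(m)
        ]
    return sum(f) / n + sum(g) / m


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical setup: one sample on a segment (intrinsic dimension 1)
    # embedded in R^3, the other sample uniform on the unit cube in R^3.
    xs = [(rng.random(), 0.0, 0.0) for _ in range(20)]
    ys = [(rng.random(), rng.random(), rng.random()) for _ in range(20)]
    print(empirical_eot_cost(xs, ys, eps=0.5))
```

Repeating the call over independent samples of growing size $n$ and averaging the absolute deviation from a large-sample reference value gives a simple Monte Carlo check of how the error behaves in $n$ and $\epsilon$. The pure-Python loops are only suitable for small samples; a vectorized or library-based Sinkhorn solver would be the natural replacement for larger experiments.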