Hilbert's projective metric for functions of bounded growth and exponential convergence of Sinkhorn's algorithm (2311.04041v2)
Abstract: Motivated by the entropic optimal transport problem in unbounded settings, we study versions of Hilbert's projective metric for spaces of integrable functions of bounded growth. These versions of Hilbert's metric originate from cones which are relaxations of the cone of all non-negative functions, in the sense that they include all functions having non-negative integral values when multiplied with certain test functions. We show that kernel integral operators are contractions with respect to suitable specifications of such metrics even for kernels which are not bounded away from zero, provided that the decay to zero of the kernel is controlled. As an application to entropic optimal transport, we show exponential convergence of Sinkhorn's algorithm in settings where the marginal distributions have sufficiently light tails compared to the growth of the cost function.
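To make the objects named in the abstract concrete, the following is a minimal sketch of Sinkhorn's algorithm on a finite grid, tracking Hilbert's projective metric between successive scaling vectors. The grid, the discretized Gaussian marginals, the quadratic cost, the regularization `eps`, and the function names `hilbert_metric` and `sinkhorn` are all illustrative assumptions and are not taken from the paper. A finite grid is a bounded setting, so this only illustrates the classical version of the metric on strictly positive vectors, not the relaxed cones used for unbounded settings.

```python
import numpy as np


def hilbert_metric(u, v):
    """Hilbert's projective metric between strictly positive vectors u and v."""
    log_ratio = np.log(u) - np.log(v)
    return float(log_ratio.max() - log_ratio.min())


def sinkhorn(mu, nu, cost, eps=1.0, n_iter=100):
    """Sinkhorn iterations for discrete entropic optimal transport.

    Returns the coupling after n_iter iterations and the Hilbert distance
    between successive row scalings, one value per iteration.
    """
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    history = []
    for _ in range(n_iter):
        u_new = mu / (K @ v)         # fit the first marginal
        v = nu / (K.T @ u_new)       # fit the second marginal
        history.append(hilbert_metric(u_new, u))
        u = u_new
    plan = u[:, None] * K * v[None, :]
    return plan, history


# Illustrative data (assumed): discretized Gaussians with quadratic cost.
x = np.linspace(-3.0, 3.0, 40)
mu = np.exp(-x**2 / 2)
mu /= mu.sum()
nu = np.exp(-(x - 0.5) ** 2 / 2)
nu /= nu.sum()
cost = (x[:, None] - x[None, :]) ** 2

plan, history = sinkhorn(mu, nu, cost)
print("Hilbert distance, first iterations:", history[:5])
print("max row-marginal error:", np.abs(plan.sum(axis=1) - mu).max())
```

Under these assumptions the printed distances should shrink by a roughly constant factor per iteration, which is the kind of exponential convergence the abstract refers to.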