Compressed online Sinkhorn (2310.05019v1)
Abstract: Optimal transport (OT) distances, and in particular entropic-regularised OT distances, are increasingly popular evaluation metrics in many areas of machine learning and data science. Their use has largely been driven by the availability of efficient algorithms such as the Sinkhorn algorithm. One drawback of the Sinkhorn algorithm for large-scale data processing is that it is a two-phase method: one first draws a large stream of data from the probability distributions, and then applies the Sinkhorn algorithm to the resulting discrete probability measures. More recently, several works have developed stochastic versions of Sinkhorn that directly handle continuous streams of data. In this work, we revisit the recently introduced online Sinkhorn algorithm of [Mensch and Peyré, 2020]. Our contributions are twofold. First, we improve the convergence analysis for the online Sinkhorn algorithm; the new rate that we obtain is faster than the previous one under certain parameter choices, and we present numerical results to verify the sharpness of our result. Second, we propose the compressed online Sinkhorn algorithm, which combines measure compression techniques with the online Sinkhorn algorithm. We provide numerical experiments showing practical gains, as well as theoretical guarantees on the efficiency of our approach.
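For context, the two-phase approach the abstract refers to amounts to sampling the two distributions, forming discrete measures, and running Sinkhorn matrix-scaling iterations on the resulting cost matrix. The sketch below is a minimal illustration of that standard discrete Sinkhorn step, not the paper's online or compressed variants; the function name, parameters, and example data are assumptions for illustration only.

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.05, n_iters=500):
    """Minimal sketch of discrete entropic-regularised OT via Sinkhorn scaling.

    a, b : probability weights of the two discrete measures, shapes (n,), (m,)
    C    : cost matrix, shape (n, m)
    eps  : entropic regularisation strength
    Returns <P, C> for the fitted transport plan P.
    """
    K = np.exp(-C / eps)                 # Gibbs kernel
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)                  # rescale rows to match marginal a
        v = b / (K.T @ u)                # rescale columns to match marginal b
    P = u[:, None] * K * v[None, :]      # transport plan with the prescribed marginals
    return np.sum(P * C)

# Phase 1: draw samples and form discrete (uniform-weight) measures.
rng = np.random.default_rng(0)
x, y = rng.normal(size=(40, 2)), rng.normal(loc=1.0, size=(50, 2))
C = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)   # squared-Euclidean cost
a, b = np.full(40, 1 / 40), np.full(50, 1 / 50)

# Phase 2: run Sinkhorn on the fixed discrete measures.
print(sinkhorn(a, b, C))
```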
- Fast Bayesian inference with batch Bayesian quadrature via kernel recombination. Advances in Neural Information Processing Systems, 35:16533–16547.
- Massively scalable Sinkhorn distances via the Nyström method. Advances in Neural Information Processing Systems, 32:4427–4437.
- Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration. arXiv preprint arXiv:1705.09634.
- Stochastic optimization for large-scale optimal transport. arXiv preprint arXiv:1605.08527.
- Kernel operations on the GPU, with autodiff, without memory overflows. Journal of Machine Learning Research, 22(74):1–6.
- A randomized algorithm to reduce the support of discrete measures. Advances in Neural Information Processing Systems, 33:15100–15110.
- Cuturi, M. (2013). Sinkhorn distances: Lightspeed computation of optimal transport. Advances in Neural Information Processing Systems, 26:2292–2300.
- Regularized discrete optimal transport. SIAM Journal on Imaging Sciences, 7(3):1853–1882.
- Gautschi, W. (2004). Orthogonal polynomials: computation and approximation. OUP Oxford.
- Sample complexity of Sinkhorn divergences. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 1574–1583. PMLR.
- Positively weighted kernel quadrature via subsampling. Advances in Neural Information Processing Systems, 35:6886–6900.
- Entropic optimal transport between unbalanced Gaussian measures has a closed form. Advances in Neural Information Processing Systems, 33:10468–10479.
- Kantorovich, L. (1942). On the transfer of masses (in Russian).
- A practical guide to quasi-Monte Carlo methods. Lecture notes.
- Mensch, A. and Peyré, G. (2020). Online Sinkhorn: Optimal transport distances from sample streams. arXiv preprint arXiv:2003.01415.
- Mischler, S. (2018). Lecture notes: An Introduction to Evolution PDEs.
- Non-asymptotic analysis of stochastic approximation algorithms for machine learning. Advances in Neural Information Processing Systems, 24:451–459.
- Local histogram based segmentation using the Wasserstein distance. International Journal of Computer Vision, 84:97–111.
- Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830.
- Computational optimal transport: With applications to data science. Foundations and Trends® in Machine Learning, 11(5-6):355–607.
- Optimal-transport analysis of single-cell gene expression identifies developmental trajectories in reprogramming. Cell, 176(4):928–943.
- Tchernychova, M. (2015). Caratheodory cubature measures. PhD thesis, Oxford.
- Van der Vaart, A. W. (2000). Asymptotic statistics, volume 3. Cambridge University Press.
- Vialard, F.-X. (2019). An elementary introduction to entropic regularization and proximal methods for numerical optimal transport. Lecture notes.
- SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, 17:261–272.
- Vocabulary learning via optimal transport for neural machine translation. arXiv preprint arXiv:2012.15671.
- Improved Nyström low-rank approximation and error analysis. In Proceedings of the 25th International Conference on Machine Learning, pages 1232–1239.