Generalized Sobolev Transport for Probability Measures on a Graph (2402.04516v2)
Abstract: We study the optimal transport (OT) problem for measures supported on a graph metric space. Recently, Le et al. (2022) leveraged the graph structure to propose a variant of OT, namely Sobolev transport (ST), which admits a closed-form expression and is therefore fast to compute. However, the definition of ST is tied to the $L^p$ geometric structure, which makes it nontrivial to use ST with other prior structures. In contrast, classic OT can adapt to various geometric structures by modifying the underlying cost function. An important instance is the Orlicz-Wasserstein (OW) distance, which moves beyond the $L^p$ structure by leveraging the \emph{Orlicz geometric structure}. Compared with the standard $p$-order Wasserstein distance, OW helps advance certain machine learning approaches. Nevertheless, OW is challenging to compute because of its two-level optimization formulation. In this work, we leverage a specific class of convex functions for the Orlicz structure to propose the generalized Sobolev transport (GST). GST encompasses ST as a special case and can be used for prior structures beyond the $L^p$ geometry. In connection with OW, we show that computing GST requires solving only a univariate optimization problem, unlike the complex two-level optimization problem in OW. We empirically illustrate that GST is several orders of magnitude faster than OW. Moreover, we provide preliminary evidence of the advantages of GST for document classification and for several tasks in topological data analysis.
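To make the contrast between a closed-form ST and a univariate-optimization GST concrete, here is a minimal sketch under stated assumptions. The abstract does not give the formulas, so everything below is an illustration rather than the authors' implementation:

- The graph is taken to be a rooted tree, where the closed-form ST is assumed to be a weighted $p$-norm of subtree mass differences.
- GST is assumed to take an Amemiya-norm form, $\inf_{k>0} \frac{1}{k}\bigl(1 + \sum_e w_e\,\Phi(k\,d_e)\bigr)$, for a convex function $\Phi$.
- The function names (`edge_terms`, `sobolev_transport`, `generalized_sobolev_transport`), the toy tree, and the search bounds are all hypothetical.

```python
def edge_terms(parent, length, mu, nu):
    """Return (edge length, |mu(subtree) - nu(subtree)|) for every tree edge.

    Assumes node 0 is the root and parent[i] < i for every i > 0, so one
    reverse sweep accumulates the mass difference of each subtree.
    """
    diff = [m - n for m, n in zip(mu, nu)]
    for i in range(len(parent) - 1, 0, -1):
        diff[parent[i]] += diff[i]
    return [(length[i], abs(diff[i])) for i in range(1, len(parent))]


def sobolev_transport(terms, p):
    """Assumed closed form: (sum_e w_e * d_e**p) ** (1/p)."""
    return sum(w * d ** p for w, d in terms) ** (1.0 / p)


def generalized_sobolev_transport(terms, phi, t_lo=1e-6, t_hi=1e3, iters=300):
    """Assumed Amemiya-type form: inf_{k>0} (1 + sum_e w_e * phi(k * d_e)) / k.

    With t = 1/k the objective t * (1 + sum_e w_e * phi(d_e / t)) is convex
    in t, so a ternary search over [t_lo, t_hi] is enough. The bounds are
    illustrative and must bracket the minimizer.
    """
    def objective(t):
        return t * (1.0 + sum(w * phi(d / t) for w, d in terms))

    lo, hi = t_lo, t_hi
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    return objective(0.5 * (lo + hi))


# Toy tree: edges 0-1 (length 1.0), 0-2 (length 2.0), 1-3 (length 0.5).
parent = [-1, 0, 0, 1]
length = [0.0, 1.0, 2.0, 0.5]  # length[i] belongs to edge (i, parent[i])
mu = [0.5, 0.5, 0.0, 0.0]
nu = [0.0, 0.0, 0.5, 0.5]

terms = edge_terms(parent, length, mu, nu)
st1 = sobolev_transport(terms, 1)
st2 = sobolev_transport(terms, 2)
# For Phi(t) = t**2 / 4 the assumed GST form reduces to the 2-order ST.
gst = generalized_sobolev_transport(terms, lambda t: t * t / 4.0)
```

In this sketch, `gst` agrees with `st2` up to numerical tolerance, which illustrates the sense in which GST can contain ST as a special case. Replacing the lambda with another convex $\Phi$ changes only the one-dimensional search, not the per-edge preprocessing.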
- Persistence images: A stable vector representation of persistent homology. Journal of Machine Learning Research, 18(1):218–252, 2017.
- Sobolev spaces. Elsevier, 2003.
- Faster high-accuracy log-concave sampling via algorithmic warm starts. arXiv preprint arXiv:2302.10249, 2023.
- Averaging on the Bures-Wasserstein manifold: Dimension-free convergence of gradient descent. Advances in Neural Information Processing Systems, 2021.
- Subspace embedding and linear regression with Orlicz norm. In International Conference on Machine Learning, pp. 224–233. PMLR, 2018.
- Spherical sliced-Wasserstein. In The Eleventh International Conference on Learning Representations, 2023.
- Proximal optimal transport modeling of population dynamics. In International Conference on Artificial Intelligence and Statistics, pp. 6511–6528. PMLR, 2022.
- The Schrödinger bridge between Gaussian measures has a closed form. In International Conference on Artificial Intelligence and Statistics, pp. 5802–5833. PMLR, 2023a.
- Learning single-cell perturbation responses using neural optimal transport. Nature Methods, 20(11):1759–1768, 2023b.
- Orlicz random Fourier features. The Journal of Machine Learning Research, 21(1):5739–5775, 2020.
- Cuturi, M. Sinkhorn distances: Lightspeed computation of optimal transport. In Advances in Neural Information Processing Systems, pp. 2292–2300, 2013.
- Fast distance oracles for any symmetric norm. Advances in Neural Information Processing Systems, 35:7304–7317, 2022.
- Persistent homology-a survey. Contemporary mathematics, 453:257–282, 2008.
- On the complexity of the optimal transport problem with graph-structured cost. In International Conference on Artificial Intelligence and Statistics, pp. 9147–9165. PMLR, 2022.
- Unbalanced minibatch optimal transport; applications to domain adaptation. In International Conference on Machine Learning, pp. 3186–3197. PMLR, 2021.
- Gorard, S. Revisiting a 90-year-old debate: the advantages of the mean deviation. British Journal of Educational Studies, 53(4):417–430, 2005.
- On excess mass behavior in Gaussian mixture models with Orlicz-Wasserstein distances. In International Conference on Machine Learning, ICML, volume 202, pp. 11847–11870. PMLR, 2023.
- Huber, P. J. Robust estimation of a location parameter. In Breakthroughs in statistics: Methodology and distribution, pp. 492–518. Springer, 1992.
- Entropic optimal transport between (unbalanced) Gaussian measures has a closed form. In Advances in neural information processing systems, 2020.
- Kell, M. On interpolation and curvature via Wasserstein geodesics. Advances in Calculus of Variations, 10(2):125–167, 2017.
- Neural optimal transport. In The Eleventh International Conference on Learning Representations, 2023.
- Light Schrödinger bridge. In The Twelfth International Conference on Learning Representations, 2024.
- Shape descriptors for non-rigid shapes with a single closed contour. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), volume 1, pp. 424–429, 2000.
- On robust optimal transport: Computational complexity and barycenter computation. Advances in Neural Information Processing Systems, 34:21947–21959, 2021.
- Tree-sliced variants of Wasserstein distances. In Advances in neural information processing systems, pp. 12283–12294, 2019.
- Sobolev transport: A scalable metric for probability measures with graph metrics. In International Conference on Artificial Intelligence and Statistics, pp. 9844–9868. PMLR, 2022.
- Scalable unbalanced Sobolev transport for measures on a graph. In International Conference on Artificial Intelligence and Statistics, pp. 8521–8560. PMLR, 2023.
- Entropy regularized optimal transport independence criterion. In International Conference on Artificial Intelligence and Statistics, pp. 11247–11279. PMLR, 2022.
- Orlicz space regularization of continuous optimal transport problems. Applied Mathematics & Optimization, 85(2):14, 2022.
- Fast optimal transport through sliced Wasserstein generalized geodesics. 2023.
- Statistical bounds for entropic optimal transport: sample complexity and the central limit theorem. In Advances in Neural Information Processing Systems, pp. 4541–4551, 2019.
- Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems, pp. 3111–3119, 2013.
- Outlier-robust optimal transport. In International Conference on Machine Learning, pp. 7850–7860. PMLR, 2021.
- Müller, A. Integral probability metrics and their generating classes of functions. Advances in Applied Probability, 29(2):429–443, 1997.
- Musielak, J. Orlicz spaces and modular spaces, volume 1034. Springer, 2006.
- Asymptotic guarantees for learning generative models with the sliced-Wasserstein distance. In Advances in Neural Information Processing Systems, pp. 250–260, 2019.
- Markovian sliced Wasserstein distances: Beyond independent projections. Advances in Neural Information Processing Systems, 2023a.
- Quasi-Monte Carlo for 3D sliced Wasserstein. 2024.
- On unbalanced optimal transport: Gradient methods, sparsity and approximation error. The Journal of Machine Learning Research, 2023b.
- Many processors, little time: MCMC for partitions via optimal transport couplings. In International Conference on Artificial Intelligence and Statistics, pp. 3483–3514. PMLR, 2022.
- Outlier-robust optimal transport: Duality, structure, and statistical analysis. In International Conference on Artificial Intelligence and Statistics, pp. 11691–11719. PMLR, 2022.
- Outlier-robust Wasserstein DRO. Thirty-seventh Conference on Neural Information Processing Systems, 2023.
- Computational optimal transport. Foundations and Trends® in Machine Learning, 11(5-6):355–607, 2019.
- Wasserstein barycenter and its application to texture mixing. In International Conference on Scale Space and Variational Methods in Computer Vision, pp. 435–446, 2011.
- Theory of Orlicz spaces. Marcel Dekker, 1991.
- Term-weighting approaches in automatic text retrieval. Information processing & management, 24(5):513–523, 1988.
- Low-rank Sinkhorn factorization. International Conference on Machine Learning (ICML), 2021.
- Optimal-transport analysis of single-cell gene expression identifies developmental trajectories in reprogramming. Cell, 176(4):928–943, 2019.
- Efficient symmetric norm regression via linear sketching. Advances in Neural Information Processing Systems, 32, 2019.
- Sturm, K.-T. Generalized Orlicz spaces and Wasserstein distances for convex–concave scale functions. Bulletin des sciences mathématiques, 135(6-7):795–802, 2011.
- Fixed support tree-sliced Wasserstein barycenter. In Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, volume 151, pp. 1120–1137. PMLR, 2022.
- Optimal transport for structured data with application on graphs. In International Conference on Machine Learning, pp. 6275–6284. PMLR, 2019.
- Trajectorynet: A dynamic optimal transport network for modeling cellular dynamics. In International conference on machine learning, pp. 9526–9536. PMLR, 2020.
- Villani, C. Optimal transport: old and new, volume 338. Springer Science & Business Media, 2008.
- Two-sample test with kernel projected wasserstein distance. In Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, volume 151, pp. 8022–8055. PMLR, 2022.
- Estimation of smooth densities in Wasserstein distance. In Proceedings of the Thirty-Second Conference on Learning Theory, volume 99, pp. 3118–3119, 2019.
- Zhang, Z. Parameter estimation techniques: A tutorial with application to conic fitting. Image and vision Computing, 15(1):59–76, 1997.