GeONet: a neural operator for learning the Wasserstein geodesic (2209.14440v4)
Abstract: Optimal transport (OT) offers a versatile framework to compare complex data distributions in a geometrically meaningful way. Traditional methods for computing the Wasserstein distance and geodesic between probability measures require mesh-specific domain discretization and suffer from the curse of dimensionality. We present GeONet, a mesh-invariant deep neural operator network that learns the non-linear mapping from the input pair of initial and terminal distributions to the Wasserstein geodesic connecting the two endpoint distributions. In the offline training stage, GeONet learns the saddle point optimality conditions for the dynamic formulation of the OT problem in the primal and dual spaces, which are characterized by a coupled PDE system. The subsequent inference stage is instantaneous and can be deployed for real-time predictions in the online learning setting. We demonstrate that GeONet achieves testing accuracy comparable to that of standard OT solvers on simulation examples and the MNIST dataset, while reducing inference-stage computational cost by orders of magnitude.
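To make the target object concrete, here is a minimal sketch of a Wasserstein geodesic in the one setting where it has a simple closed form: two 1D Gaussians under the quadratic cost, where the geodesic stays Gaussian and its mean and standard deviation interpolate linearly in time. This is not the GeONet architecture or the authors' code; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np


def w2_gaussian_1d(m0, s0, m1, s1):
    """Closed-form 2-Wasserstein distance between N(m0, s0^2) and N(m1, s1^2)."""
    return float(np.sqrt((m0 - m1) ** 2 + (s0 - s1) ** 2))


def geodesic_gaussian_1d(m0, s0, m1, s1, t):
    """Mean and std of the W2 geodesic at time t in [0, 1].

    For 1D Gaussians the displacement interpolation is again Gaussian,
    with mean and standard deviation interpolated linearly in t.
    """
    return (1.0 - t) * m0 + t * m1, (1.0 - t) * s0 + t * s1


def gaussian_pdf(x, m, s):
    """Density of N(m, s^2) evaluated at the points x."""
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))


def geodesic_densities(m0, s0, m1, s1, ts, xs):
    """Evaluate the geodesic density on a space-time grid.

    Returns an array of shape (len(ts), len(xs)); row k is the density at ts[k].
    """
    rows = [gaussian_pdf(xs, *geodesic_gaussian_1d(m0, s0, m1, s1, t)) for t in ts]
    return np.stack(rows)


# Hypothetical endpoint distributions and evaluation grid.
xs = np.linspace(-10.0, 10.0, 2001)
ts = np.linspace(0.0, 1.0, 5)
rho = geodesic_densities(-2.0, 0.5, 3.0, 1.5, ts, xs)
```

Each row of `rho` is the density at one time slice, so `rho[0]` and `rho[-1]` are the two input distributions. A neural operator such as the one described in the abstract would aim to produce this kind of space-time density for general input pairs where no closed form is available.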