A Mathematical Guide to Operator Learning (2312.14688v1)

Published 22 Dec 2023 in math.NA, cs.AI, cs.LG, and cs.NA

Abstract: Operator learning aims to discover properties of an underlying dynamical system or partial differential equation (PDE) from data. Here, we present a step-by-step guide to operator learning. We explain the types of problems and PDEs amenable to operator learning, discuss various neural network architectures, and explain how to employ numerical PDE solvers effectively. We also give advice on how to create and manage training data and conduct optimization. We offer intuition behind the various neural network architectures employed in operator learning by motivating them from the point of view of numerical linear algebra.
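To make the numerical-linear-algebra viewpoint in the abstract concrete, here is a minimal sketch (our illustration, not code from the paper; the problem, discretization, and least-squares "learner" are all assumptions chosen for brevity). On a grid, the solution operator of a linear PDE is just a matrix, so learning it from input-output pairs reduces to a regression problem in linear algebra:

```python
import numpy as np

# Minimal sketch: learn the solution operator of the 1D Poisson problem
# -u'' = f on [0, 1] with zero Dirichlet boundary conditions. On a grid,
# the operator is a matrix G with u = G f (a discretization of the
# Green's function), so "operator learning" reduces to recovering G
# from (f, u) training pairs. This setup is our illustrative assumption,
# not the paper's code.

n = 64                                   # number of interior grid points
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)

# Second-order finite-difference Laplacian: the "numerical PDE solver"
# used to generate training data.
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

# Training data: random forcings f and their solutions u = A^{-1} f.
# (Operator-learning pipelines typically sample smoother forcings, e.g.
# draws from a Gaussian process; white noise keeps this sketch short.)
rng = np.random.default_rng(0)
m = 200                                  # number of training pairs
F = rng.standard_normal((m, n))          # each row is one forcing f
U = np.linalg.solve(A, F.T).T            # each row is the matching u

# "Learning" step: recover G by least squares over the training pairs,
# i.e. minimize ||F G^T - U||. Neural operators such as DeepONet or the
# Fourier neural operator replace this linear map with a parameterized
# nonlinear model trained by gradient descent.
GT, *_ = np.linalg.lstsq(F, U, rcond=None)
G = GT.T

# Test on an unseen forcing.
f_test = np.sin(3.0 * np.pi * x)
u_true = np.linalg.solve(A, f_test)
u_pred = G @ f_test
print("relative error:",
      np.linalg.norm(u_pred - u_true) / np.linalg.norm(u_true))
```

The linear case is exactly solvable, which is why it gives clean intuition: the nonlinear and infinite-dimensional settings discussed in the paper keep the same data-to-operator structure but swap the least-squares step for a neural network architecture and an optimizer.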
