
Operator Learning: Algorithms and Analysis (2402.15715v1)

Published 24 Feb 2024 in cs.LG, cs.NA, and math.NA

Abstract: Operator learning refers to the application of ideas from machine learning to approximate (typically nonlinear) operators mapping between Banach spaces of functions. Such operators often arise from physical models expressed in terms of partial differential equations (PDEs). In this context, such approximate operators hold great potential as efficient surrogate models to complement traditional numerical methods in many-query tasks. Being data-driven, they also enable model discovery when a mathematical description in terms of a PDE is not available. This review focuses primarily on neural operators, built on the success of deep neural networks in the approximation of functions defined on finite dimensional Euclidean spaces. Empirically, neural operators have shown success in a variety of applications, but our theoretical understanding remains incomplete. This review article summarizes recent progress and the current state of our theoretical understanding of neural operators, focusing on an approximation theoretic point of view.
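To make the abstract's central idea concrete, here is a minimal sketch of operator learning that is not from the paper itself: we learn a surrogate for the (linear) antiderivative operator from discretized input/output function pairs. The grid, the random polynomial inputs, and the plain least-squares surrogate are all illustrative assumptions; neural operators as surveyed in the paper replace the linear map with a deep, discretization-aware architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64                       # grid resolution for discretizing functions on [0, 1]
x = np.linspace(0.0, 1.0, n)

def sample_input(rng):
    # draw a random low-order polynomial a(x) as the input function
    coeffs = rng.normal(size=4)
    return sum(c * x**k for k, c in enumerate(coeffs))

def antiderivative(a):
    # ground-truth operator G: a(x) -> u(x) = \int_0^x a(s) ds (trapezoid rule)
    dx = x[1] - x[0]
    return np.concatenate([[0.0], np.cumsum((a[1:] + a[:-1]) / 2.0) * dx])

# training data: many (input function, output function) pairs on the grid
A = np.stack([sample_input(rng) for _ in range(200)])
U = np.stack([antiderivative(a) for a in A])

# fit a linear surrogate operator W so that U ≈ A @ W (least squares)
W, *_ = np.linalg.lstsq(A, U, rcond=None)

# evaluate the surrogate on a fresh, unseen input function
a_test = sample_input(rng)
u_pred = a_test @ W
u_true = antiderivative(a_test)
err = np.max(np.abs(u_pred - u_true))
```

Because the true operator here is linear and the inputs lie in a low-dimensional subspace, the least-squares surrogate recovers it essentially exactly; the nonlinear operators the review addresses are precisely the setting where such linear surrogates fail and neural operators are needed.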

Authors (3)
  1. Nikola B. Kovachki (12 papers)
  2. Samuel Lanthaler (22 papers)
  3. Andrew M. Stuart (86 papers)
Citations (17)
