Data Complexity Estimates for Operator Learning (2405.15992v2)

Published 25 May 2024 in cs.LG, cs.NA, and math.NA

Abstract: Operator learning has emerged as a new paradigm for the data-driven approximation of nonlinear operators. Despite its empirical success, the theoretical underpinnings governing the conditions for efficient operator learning remain incomplete. The present work develops theory to study the data complexity of operator learning, complementing existing research on the parametric complexity. We investigate the fundamental question: How many input/output samples are needed in operator learning to achieve a desired accuracy $\epsilon$? This question is addressed from the point of view of $n$-widths, and this work makes two key contributions. The first contribution is to derive lower bounds on $n$-widths for general classes of Lipschitz and Fréchet differentiable operators. These bounds rigorously demonstrate a ``curse of data complexity'', revealing that learning on such general classes requires a sample size exponential in the inverse of the desired accuracy $\epsilon$. The second contribution of this work is to show that ``parametric efficiency'' implies ``data efficiency''; using the Fourier neural operator (FNO) as a case study, we show rigorously that on a narrower class of operators, efficiently approximated by FNO in terms of the number of tunable parameters, efficient operator learning is attainable in data complexity as well. Specifically, we show that if only an algebraically increasing number of tunable parameters is needed to reach a desired approximation accuracy, then an algebraically bounded number of data samples is also sufficient to achieve the same accuracy.
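
To make the two regimes in the abstract concrete, here is an informal sketch; the sample count $N(\epsilon)$, constants $c, C$, and exponents $\gamma, \kappa > 0$ are illustrative placeholders, not the exact quantities proved in the paper:

$$ N(\epsilon) \gtrsim \exp\!\left(c\,\epsilon^{-\gamma}\right) \quad \text{(general Lipschitz / Fréchet differentiable classes: curse of data complexity),} $$

$$ N(\epsilon) \lesssim C\,\epsilon^{-\kappa} \quad \text{(narrower class efficiently approximated by FNO: algebraic data complexity).} $$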
