PDE Generalization of In-Context Operator Networks: A Study on 1D Scalar Nonlinear Conservation Laws (2401.07364v2)

Published 14 Jan 2024 in cs.LG, cs.AI, cs.NA, and math.NA

Abstract: Can we build a single large model for a wide range of PDE-related scientific learning tasks? Can this model generalize to new PDEs, even of new forms, without any fine-tuning? In-context operator learning and the corresponding model In-Context Operator Networks (ICON) represent an initial exploration of these questions. ICON's capability regarding the first question has been demonstrated previously. In this paper, we present a detailed methodology for solving PDE problems with ICON, and show how a single ICON model can make forward and reverse predictions for different equations with different strides, provided with appropriately designed data prompts. We present positive evidence for the second question: ICON can generalize well to some PDEs of new forms without any fine-tuning. This is exemplified through a study on 1D scalar nonlinear conservation laws, a family of PDEs with temporal evolution. We also show how to broaden the range of problems that an ICON model can address, by transforming functions and equations into ICON's capability scope. We believe that the progress in this paper is a significant step towards the goal of training a foundation model for PDE-related tasks under the in-context operator learning framework.
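The family of PDEs studied here, 1D scalar nonlinear conservation laws, takes the form u_t + f(u)_x = 0. As an illustrative sketch (not the paper's WENO-based data pipeline), the forward temporal evolution that such a model would be prompted with can be generated by a simple finite-volume scheme; the function names and grid parameters below are chosen for illustration:

```python
import numpy as np

# Toy forward solver for a 1D scalar nonlinear conservation law,
#   u_t + f(u)_x = 0  with  f(u) = u^2 / 2  (inviscid Burgers),
# using the first-order Lax-Friedrichs scheme on a periodic grid.

def lax_friedrichs_step(u, dx, dt, flux=lambda u: 0.5 * u**2):
    f = flux(u)
    u_l, u_r = np.roll(u, 1), np.roll(u, -1)   # periodic neighbors u_{i-1}, u_{i+1}
    f_l, f_r = np.roll(f, 1), np.roll(f, -1)
    return 0.5 * (u_l + u_r) - 0.5 * (dt / dx) * (f_r - f_l)

def evolve(u0, dx, dt, n_steps):
    u = u0.copy()
    for _ in range(n_steps):
        u = lax_friedrichs_step(u, dx, dt)
    return u

# Example: a smooth sine initial condition steepening toward a shock.
n = 200
x = np.linspace(0.0, 1.0, n, endpoint=False)
dx = x[1] - x[0]
u0 = np.sin(2 * np.pi * x)
u1 = evolve(u0, dx, dt=0.5 * dx, n_steps=100)   # CFL number 0.5
```

Pairs of snapshots (u at time t, u at time t + stride) produced this way are exactly the kind of condition/quantity-of-interest function pairs that populate an ICON data prompt for forward prediction; reversing the roles of the pair gives the reverse-prediction task.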
