Towards a Foundation Model for Partial Differential Equations: Multi-Operator Learning and Extrapolation (2404.12355v3)
Abstract: Foundation models, such as LLMs, have demonstrated success in addressing various language and image processing tasks. In this work, we introduce a multi-modal foundation model for scientific problems, named PROSE-PDE. Our model, designed for bi-modality to bi-modality learning, is a multi-operator learning approach that can predict future states of spatiotemporal systems while concurrently learning the underlying governing equations of the physical system. Specifically, we focus on multi-operator learning by training on distinct one-dimensional time-dependent nonlinear constant-coefficient partial differential equations, with potential applications across many fields, including physics, geology, and biology. More importantly, we provide three extrapolation studies demonstrating that PROSE-PDE can generalize physical features through the robust training of multiple operators, and that the proposed model can extrapolate to predict PDE solutions whose governing models or data were unseen during training. Furthermore, we show through systematic numerical experiments that the symbolic modality in our model effectively resolves the well-posedness problem that arises when training multiple operators, thereby enhancing the model's predictive capabilities.
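The symbolic modality described above requires serializing each governing equation into a token sequence the transformer can consume. PROSE-PDE's exact vocabulary is not specified in this abstract; a minimal sketch of one common convention, Polish (prefix) notation, is shown below. The operator names (`add`, `mul`, `dt`, `dx`) and the tuple-based expression tree are illustrative assumptions, not the model's actual tokenizer.

```python
# Hedged sketch: encoding a PDE's symbolic modality as Polish (prefix)
# notation tokens, a parenthesis-free serialization often used when
# transformers read or emit symbolic expressions.
# The operator vocabulary below is an illustrative assumption.

def to_prefix(node):
    """Flatten an expression tree (nested tuples) into prefix tokens."""
    if isinstance(node, tuple):
        op, *args = node
        tokens = [op]
        for arg in args:
            tokens.extend(to_prefix(arg))
        return tokens
    return [str(node)]

# Inviscid Burgers' equation written as an expression tree: u_t + u * u_x
burgers = ("add", ("dt", "u"), ("mul", "u", ("dx", "u")))
print(to_prefix(burgers))  # ['add', 'dt', 'u', 'mul', 'u', 'dx', 'u']
```

Prefix notation makes the sequence unambiguous without bracket tokens, which keeps the symbolic decoder's output space small and easy to validate.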