OmniArch: Building Foundation Model For Scientific Computing (2402.16014v3)
Abstract: Foundation models have revolutionized language modeling, yet whether this success carries over to scientific computing remains largely unexplored. We present OmniArch, the first prototype aimed at solving multi-scale, multi-physics scientific computing problems with physical alignment, addressing all three challenges within one unified architecture. Its pre-training stage pairs a Fourier encoder-decoder, which smooths out the mismatch across separate spatial dimensions, with a Transformer backbone that integrates physical quantities through their temporal dynamics; the novel PDE-Aligner then performs physics-informed fine-tuning under flexible conditions. To our knowledge, we are the first to conduct unified 1D-2D-3D pre-training on PDEBench. OmniArch not only sets new performance benchmarks for 1D, 2D, and 3D PDEs but also adapts readily to new physics via in-context and zero-shot learning, supporting realistic engineering applications and forward-looking physics discovery.
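The abstract describes a three-part design: a Fourier encoder-decoder that maps fields of any spatial dimension into a common token space, a Transformer backbone that models the resulting token sequence over time, and a PDE-Aligner for physics-informed fine-tuning. The snippet below is a minimal sketch of the first two components only, assuming PyTorch; the module names, mode-truncation scheme, and hyperparameters are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch (assumption: PyTorch) of the encode-then-model data flow
# described in the abstract. Not the authors' code; names and sizes are illustrative.
import torch
import torch.nn as nn


class FourierTokenizer(nn.Module):
    """Map a 1D/2D/3D field to a fixed-width token by keeping a fixed number of
    Fourier coefficients (a crude stand-in for the paper's Fourier encoder)."""

    def __init__(self, n_modes: int, d_model: int):
        super().__init__()
        self.n_modes = n_modes
        self.proj = nn.Linear(2 * n_modes, d_model)  # real + imaginary parts

    def forward(self, field: torch.Tensor) -> torch.Tensor:
        # field: (batch, *spatial_dims) for any number of spatial dimensions
        spectrum = torch.fft.rfftn(field, dim=tuple(range(1, field.ndim)))
        coeffs = spectrum.flatten(start_dim=1)[:, : self.n_modes]  # truncate spectrum
        feats = torch.cat([coeffs.real, coeffs.imag], dim=-1)      # (batch, 2*n_modes)
        return self.proj(feats)                                    # (batch, d_model)


class TemporalBackbone(nn.Module):
    """Causal Transformer over per-timestep tokens, so each step attends only
    to earlier states of the system."""

    def __init__(self, d_model: int = 256, n_heads: int = 8, n_layers: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, time, d_model)
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1)).to(tokens.device)
        return self.encoder(tokens, mask=mask)


# Example: a batch of 2D fields over 10 timesteps, tokenized per step then modeled jointly.
if __name__ == "__main__":
    fields = torch.randn(4, 10, 64, 64)  # (batch, time, H, W)
    tok = FourierTokenizer(n_modes=32, d_model=256)
    tokens = torch.stack([tok(fields[:, t]) for t in range(fields.size(1))], dim=1)
    out = TemporalBackbone()(tokens)     # (4, 10, 256)
    print(out.shape)
```

A decoder inverting the Fourier projection back to a physical field, and the PDE-Aligner fine-tuning stage, would sit on top of this skeleton; they are omitted here because the abstract gives no detail on their internals.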
- Physical design using differentiable learned simulators. CoRR, abs/2202.00728, 2022. URL https://arxiv.org/abs/2202.00728.
- An overview on deep learning-based approximation methods for partial differential equations. arXiv preprint arXiv:2012.12348, 2020.
- Understanding robustness of transformers for image classification. In 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada, October 10-17, 2021, pp. 10211–10221. IEEE, 2021. doi: 10.1109/ICCV48922.2021.01007. URL https://doi.org/10.1109/ICCV48922.2021.01007.
- Three ways to solve partial differential equations with neural networks—a review. GAMM-Mitteilungen, 44(2):e202100006, 2021.
- On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258, 2021.
- Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901, 2020.
- Cao, S. Choose a transformer: Fourier or Galerkin. Advances in neural information processing systems, 34:24924–24940, 2021.
- Decision transformer: Reinforcement learning via sequence modeling. In Ranzato, M., Beygelzimer, A., Dauphin, Y. N., Liang, P., and Vaughan, J. W. (eds.), Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, NeurIPS 2021, December 6-14, 2021, virtual, pp. 15084–15097, 2021. URL https://proceedings.neurips.cc/paper/2021/hash/7f489f642a0ddb10272b5c31057f0663-Abstract.html.
- Scientific machine learning through physics–informed neural networks: Where we are and what’s next. Journal of Scientific Computing, 92(3):88, 2022.
- BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
- Medical image segmentation based on U-Net: A review. Journal of Imaging Science and Technology, 2020.
- Transformers for modeling physical systems. Neural Networks, 146:272–289, 2022.
- xval: A continuous number encoding for large language models. CoRR, abs/2310.02989, 2023. doi: 10.48550/ARXIV.2310.02989. URL https://doi.org/10.48550/arXiv.2310.02989.
- Efficient token mixing for transformers via adaptive Fourier neural operators. In International Conference on Learning Representations, 2021.
- Towards multi-spatiotemporal-scale generalized PDE modeling. CoRR, abs/2209.15616, 2022. doi: 10.48550/ARXIV.2209.15616. URL https://doi.org/10.48550/arXiv.2209.15616.
- Predicting physics in mesh-reduced space with temporal attention. arXiv preprint arXiv:2201.09113, 2022.
- Physics-informed machine learning. Nature Reviews Physics, 3(6):422–440, 2021.
- Learning operators with coupled attention. The Journal of Machine Learning Research, 23(1):9636–9698, 2022.
- Langley, P. Crafting papers on machine learning. In Langley, P. (ed.), Proceedings of the 17th International Conference on Machine Learning (ICML 2000), pp. 1207–1216, Stanford, CA, 2000. Morgan Kaufmann.
- Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895, 2020.
- Physics-informed neural operator for learning partial differential equations. arXiv preprint arXiv:2111.03794, 2021.
- Scalable transformer for PDE surrogate modeling. CoRR, abs/2305.17560, 2023. doi: 10.48550/ARXIV.2305.17560. URL https://doi.org/10.48550/arXiv.2305.17560.
- A survey of transformers. AI Open, 2022.
- Physics informed token transformer. CoRR, abs/2305.08757, 2023. doi: 10.48550/ARXIV.2305.08757. URL https://doi.org/10.48550/arXiv.2305.08757.
- Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3(3):218–229, 2021a.
- DeepXDE: A deep learning library for solving differential equations. SIAM Review, 63(1):208–228, 2021b.
- Multiple physics pretraining for physical surrogate models. CoRR, abs/2310.02994, 2023. doi: 10.48550/ARXIV.2310.02994. URL https://doi.org/10.48550/arXiv.2310.02994.
- Self-supervised learning with lie symmetries for partial differential equations. CoRR, abs/2307.05432, 2023. doi: 10.48550/ARXIV.2307.05432. URL https://doi.org/10.48550/arXiv.2307.05432.
- Oden, J. T. An introduction to the finite element method with applications to nonlinear problems (R. E. White). SIAM Rev., 31(3):512, 1989. doi: 10.1137/1031114. URL https://doi.org/10.1137/1031114.
- FourCastNet: A global data-driven high-resolution weather model using adaptive Fourier neural operators. CoRR, abs/2202.11214, 2022. URL https://arxiv.org/abs/2202.11214.
- Improving language understanding by generative pre-training. 2018.
- Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.
- Learning transferable visual models from natural language supervision. In Meila, M. and Zhang, T. (eds.), Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event, volume 139 of Proceedings of Machine Learning Research, pp. 8748–8763. PMLR, 2021. URL http://proceedings.mlr.press/v139/radford21a.html.
- Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys., 378:686–707, 2019. doi: 10.1016/J.JCP.2018.10.045. URL https://doi.org/10.1016/j.jcp.2018.10.045.
- Transformer-encoder and decoder models for questions on math. In Faggioli, G., Ferro, N., Hanbury, A., and Potthast, M. (eds.), Proceedings of the Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum, Bologna, Italy, September 5th - to - 8th, 2022, volume 3180 of CEUR Workshop Proceedings, pp. 119–137. CEUR-WS.org, 2022. URL https://ceur-ws.org/Vol-3180/paper-07.pdf.
- U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18, pp. 234–241. Springer, 2015.
- U-Net and its variants for medical image segmentation: A review of theory and applications. IEEE Access, 9:82031–82057, 2021.
- DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics, 375:1339–1364, 2018.
- Towards foundation models for scientific machine learning: Characterizing scaling and transfer behavior. CoRR, abs/2306.00258, 2023. doi: 10.48550/ARXIV.2306.00258. URL https://doi.org/10.48550/arXiv.2306.00258.
- Surrogate modeling for fluid flows based on physics-constrained deep learning without simulation data. Computer Methods in Applied Mechanics and Engineering, 361:112732, 2020.
- PDEBench: An extensive benchmark for scientific machine learning. In Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., and Oh, A. (eds.), Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022, 2022. URL http://papers.nips.cc/paper_files/paper/2022/hash/0a9747136d411fb83f0cf81820d44afb-Abstract-Datasets_and_Benchmarks.html.
- LLaMA: Open and efficient foundation language models. CoRR, abs/2302.13971, 2023a. doi: 10.48550/ARXIV.2302.13971. URL https://doi.org/10.48550/arXiv.2302.13971.
- Llama 2: Open foundation and fine-tuned chat models. CoRR, abs/2307.09288, 2023b. doi: 10.48550/ARXIV.2307.09288. URL https://doi.org/10.48550/arXiv.2307.09288.
- Factorized Fourier neural operators. arXiv preprint arXiv:2111.13802, 2021.
- Attention is all you need. Advances in neural information processing systems, 30, 2017.
- Learning the solution operator of parametric partial differential equations with physics-informed DeepONets. Science Advances, 7(40):eabi8605, 2021.
- Emergent abilities of large language models. Trans. Mach. Learn. Res., 2022, 2022. URL https://openreview.net/forum?id=yzkSU5zdwD.
- Transformers in time series: A survey. arXiv preprint arXiv:2202.07125, 2022.
- In-context operator learning with data prompts for differential equation problems. Proceedings of the National Academy of Sciences, 120(39):e2310142120, 2023a.
- Prompting in-context operator learning with sensor data, equations, and natural language. CoRR, abs/2308.05061, 2023b. doi: 10.48550/ARXIV.2308.05061. URL https://doi.org/10.48550/arXiv.2308.05061.
- A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT. arXiv preprint arXiv:2302.09419, 2023.