OmniArch: Building Foundation Model For Scientific Computing (2402.16014v3)

Published 25 Feb 2024 in cs.LG and cs.AI

Abstract: Foundation models have revolutionized language modeling, but whether this success carries over to scientific computing remains unexplored. We present OmniArch, the first prototype aimed at solving multi-scale and multi-physics scientific computing problems with physical alignment. A single unified architecture addresses all three challenges (multi-scale, multi-physics, and physical alignment): its pre-training stage pairs a Fourier encoder-decoder, which smooths out the mismatch between data of different spatial dimensions, with a Transformer backbone that integrates physical quantities through their temporal dynamics, while the novel PDE-Aligner performs physics-informed fine-tuning under flexible conditions. To our knowledge, this is the first unified 1D-2D-3D pre-training, conducted on PDEBench; it not only sets new performance benchmarks for 1D, 2D, and 3D PDEs but also adapts well to new physics via in-context and zero-shot learning, supporting realistic engineering applications and prospective physics discovery.
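
The abstract only names the architectural pieces, so a small sketch may help make them concrete. The PyTorch fragment below is purely illustrative and reconstructs only what the abstract states: a spectral encoder that maps field snapshots of any spatial dimensionality into a shared token space, and a causal Transformer over the resulting per-timestep tokens. The module names (`FourierEncoder`, `TemporalBackbone`) and every hyperparameter are hypothetical choices of ours, not the authors' implementation, and the PDE-Aligner fine-tuning stage is omitted.

```python
# Illustrative sketch only; all names and hyperparameters are assumptions,
# not the OmniArch codebase. Requires PyTorch.
import torch
import torch.nn as nn


class FourierEncoder(nn.Module):
    """Turn one field snapshot of any spatial dimensionality into a
    fixed-size token by truncating its (flattened) Fourier spectrum."""

    def __init__(self, n_modes: int, d_model: int):
        super().__init__()
        self.n_modes = n_modes
        self.proj = nn.Linear(2 * n_modes, d_model)  # real + imaginary parts

    def forward(self, field: torch.Tensor) -> torch.Tensor:
        # field: (batch, *spatial); FFT over all spatial axes.
        spectrum = torch.fft.fftn(field, dim=tuple(range(1, field.ndim)))
        # Keep the first n_modes flattened coefficients -- a crude
        # truncation chosen for brevity, not a proper low-pass filter.
        flat = spectrum.flatten(start_dim=1)[:, : self.n_modes]
        feats = torch.cat([flat.real, flat.imag], dim=-1)
        return self.proj(feats)  # (batch, d_model)


class TemporalBackbone(nn.Module):
    """Causal Transformer over per-timestep tokens, so latent dynamics
    are modeled autoregressively, like next-token prediction."""

    def __init__(self, d_model: int = 256, n_heads: int = 8, n_layers: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, time, d_model); mask blocks attention to the future.
        t = tokens.size(1)
        causal = torch.triu(torch.full((t, t), float("-inf")), diagonal=1)
        return self.encoder(tokens, mask=causal)


# Usage: encode a rollout of 2D fields, then model its temporal dynamics.
enc = FourierEncoder(n_modes=64, d_model=256)
backbone = TemporalBackbone()
rollout = torch.randn(4, 10, 32, 32)  # (batch, time, height, width)
tokens = torch.stack([enc(rollout[:, t]) for t in range(10)], dim=1)
latents = backbone(tokens)            # (batch, time, 256)
```

The design point the sketch captures is that a truncated Fourier spectrum has a fixed length regardless of whether the input field is 1D, 2D, or 3D, which is what would let one Transformer backbone be pre-trained jointly across dimensionalities.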

