Self-supervised Pretraining for Partial Differential Equations (2407.06209v1)

Published 3 Jul 2024 in cs.LG

Abstract: In this work, we describe a novel approach to building a neural PDE solver that leverages recent advances in transformer-based neural network architectures. Our model can provide solutions for different values of the PDE parameters without retraining the network. Training is carried out in a self-supervised manner, similar to the pretraining approaches used in language and vision tasks. We hypothesize that the model is in effect learning a family of operators (one for each parameter value) that map the initial condition to the solution of the PDE at any future time step t. We compare this approach with the Fourier Neural Operator (FNO) and demonstrate that it can generalize over the space of PDE parameters, although its prediction error for individual parameter values is higher than that of the FNO. We show that performance on a specific parameter can be improved by finetuning the model with very small amounts of data. We also demonstrate that the model scales with data as well as model size.
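
The operator-family hypothesis in the abstract can be written out as a short sketch. The notation here is an assumption made for illustration and is not taken from the paper: $\lambda$ denotes the PDE parameters, $\mathcal{F}$ the spatial differential operator, $u_0$ the initial condition, and $f_\theta$ the trained network with weights $\theta$. How $\lambda$ and $t$ are supplied to the network is not stated in the abstract, so that is left unspecified.

```latex
% Assumed generic form of a time-dependent PDE with parameters \lambda
\partial_t u(x, t) = \mathcal{F}\bigl(u(x, t); \lambda\bigr), \qquad u(x, 0) = u_0(x)

% For each parameter value \lambda and time t, a solution operator
% maps the initial condition to the solution at time t
\mathcal{G}_{\lambda}^{t} : u_0 \mapsto u(\cdot, t; \lambda)

% Hypothesis: a single set of weights \theta approximates the whole family
f_\theta \approx \bigl\{ \mathcal{G}_{\lambda}^{t} \bigr\}_{\lambda, t}
```

Read this way, an operator learner trained for one fixed parameter value approximates a single $\mathcal{G}_{\lambda}^{t}$, whereas the abstract's claim is that one pretrained model covers a range of $\lambda$ without retraining.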
