Towards Foundation Models for Scientific Machine Learning: Characterizing Scaling and Transfer Behavior (2306.00258v1)

Published 1 Jun 2023 in cs.LG, cs.NA, and math.NA

Abstract: Pre-trained ML models have shown great performance for a wide range of applications, in particular in natural language processing (NLP) and computer vision (CV). Here, we study how pre-training could be used for scientific machine learning (SciML) applications, specifically in the context of transfer learning. We study the transfer behavior of these models as (i) the pre-trained model size is scaled, (ii) the downstream training dataset size is scaled, (iii) the physics parameters are systematically pushed out of distribution, and (iv) how a single model pre-trained on a mixture of different physics problems can be adapted to various downstream applications. We find that, when fine-tuned appropriately, transfer learning can help reach desired accuracy levels with orders of magnitude fewer downstream examples (across different tasks that can even be out-of-distribution) than training from scratch, with consistent behavior across a wide range of downstream examples. We also find that fine-tuning these models yields more performance gains as model size increases, compared to training from scratch on new downstream tasks. These results hold for a broad range of PDE learning tasks. All in all, our results demonstrate the potential of the "pre-train and fine-tune" paradigm for SciML problems, pointing to a path towards building SciML foundation models. We open-source our code for reproducibility.
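
As a rough illustration of the pre-train-then-fine-tune workflow the abstract describes, the sketch below pre-trains a toy surrogate model on a large synthetic dataset and then fine-tunes it on a small "downstream" set. It assumes a PyTorch-style setup; the model architecture, data, and hyperparameters are illustrative placeholders and not the authors' open-sourced code.

```python
# Illustrative sketch of the "pre-train and fine-tune" paradigm for a PDE surrogate.
# All names, shapes, and hyperparameters are placeholders, not the paper's actual setup.
import torch
import torch.nn as nn

class Surrogate(nn.Module):
    """Toy operator surrogate: maps a discretized source term to a solution field."""
    def __init__(self, n=64, width=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n, width), nn.GELU(),
            nn.Linear(width, width), nn.GELU(),
            nn.Linear(width, n),
        )

    def forward(self, f):
        return self.net(f)

def train(model, inputs, targets, epochs, lr):
    """Full-batch MSE training loop used for both pre-training and fine-tuning."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    loss = loss_fn(model(inputs), targets)
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        opt.step()
    return loss.item()

# Stand-in "physics" data: random source terms mapped through a fixed linear operator.
torch.manual_seed(0)
n = 64
A = torch.randn(n, n) / n ** 0.5         # stand-in for a discretized PDE operator
pre_f = torch.randn(4096, n)             # large pre-training set
pre_u = pre_f @ A.T
down_f = torch.randn(32, n)              # small downstream set (few examples)
down_u = down_f @ A.T

# 1) Pre-train on the large dataset, 2) fine-tune all weights on the small downstream set.
model = Surrogate(n)
train(model, pre_f, pre_u, epochs=200, lr=1e-3)                        # pre-training stage
downstream_loss = train(model, down_f, down_u, epochs=100, lr=1e-4)    # fine-tuning stage
print(f"downstream MSE after fine-tuning: {downstream_loss:.4e}")
```

The point mirrored here is simply that the pre-trained weights are reused and updated on the small downstream set, rather than training a new model from scratch on the downstream task.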

Authors (7)
  1. Shashank Subramanian (23 papers)
  2. Peter Harrington (22 papers)
  3. Kurt Keutzer (200 papers)
  4. Wahid Bhimji (24 papers)
  5. Dmitriy Morozov (29 papers)
  6. Michael Mahoney (18 papers)
  7. Amir Gholami (60 papers)
Citations (44)
