Distillation of atomistic foundation models across architectures and chemical domains (2506.10956v1)
Abstract: Machine-learned interatomic potentials have transformed computational research in the physical sciences. Recent atomistic `foundation' models have changed the field yet again: trained on many different chemical elements and domains, these potentials are widely applicable, but comparatively slow and resource-intensive to run. Here we show how distillation via synthetic data can be used to cheaply transfer knowledge from atomistic foundation models to a range of different architectures, unlocking much smaller, more efficient potentials. We demonstrate speed-ups of $> 10\times$ by distilling from one graph-network architecture into another, and $> 100\times$ by leveraging the atomic cluster expansion framework. We showcase applicability across chemical and materials domains: from liquid water to hydrogen under extreme conditions; from porous silica and a hybrid halide perovskite solar-cell material to modelling organic reactions. Our work shows how distillation can support the routine and computationally efficient use of current and future atomistic foundation models in real-world scientific research.
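To make the synthetic-data distillation idea concrete, the sketch below outlines the generic workflow the abstract describes: label a set of synthetic configurations with a foundation ("teacher") model, then fit a smaller ("student") potential to those labels. It is a minimal illustration only, not the paper's implementation; `teacher_calc` and `fit_student_potential` are hypothetical placeholders standing in for any ASE-compatible foundation-model calculator and any student-fitting routine (e.g. a compact graph network or an ACE fit).

```python
# Minimal, illustrative sketch of synthetic-data distillation.
# Assumptions: `teacher_calc` is any ASE-compatible calculator wrapping a
# foundation model, and `fit_student_potential` is a hypothetical routine
# that trains a smaller potential on teacher-labelled data.
from ase.io import read, write


def distill(teacher_calc, structures_file, fit_student_potential):
    """Label synthetic structures with the teacher, then fit the student."""
    frames = read(structures_file, index=":")  # synthetic configurations

    for atoms in frames:
        atoms.calc = teacher_calc
        # Record teacher energies and forces so they persist on disk.
        atoms.info["energy"] = atoms.get_potential_energy()
        atoms.arrays["forces"] = atoms.get_forces()
        atoms.calc = None

    write("distillation_set.xyz", frames)

    # Train the smaller, faster student potential on the teacher labels.
    return fit_student_potential("distillation_set.xyz")
```

In practice the speed-ups reported in the abstract come from the student's cheaper architecture: once the teacher's knowledge is captured in the labelled dataset, production simulations only ever evaluate the student.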