Atomic Foundation Models Overview

Updated 24 December 2025
  • Atomic Foundation Models are large-scale, pre-trained machine learning models that provide physically consistent predictions for energy, force, and properties in atomistic simulations.
  • They leverage advanced equivariant architectures, together with rigorous data and parameter scaling laws, to deliver size-consistent and symmetry-preserving predictions across diverse chemical spaces.
  • These models enable practical applications in materials science, chemistry, and biology through fine-tuning, transfer learning, and efficient surrogate distillation techniques.

Atomic Foundation Models are large-scale, pre-trained machine learning models designed to deliver general-purpose, physically consistent predictive capabilities for atomistic simulations across materials, molecules, and chemical systems. These models are trained on broad chemical spaces covering many elements and structures, and they provide a unified architecture for accurate energy, force, and property prediction. Atomic Foundation Models build on principles of scale (expressed through data and parameter scaling laws), robust 3D representations, and advanced equivariant architectures, paralleling the role of LLMs and vision FMs in their respective domains (Yuan et al., 13 Mar 2025, Batatia et al., 2023).

1. Mathematical Formalism and Model Definition

Atomic Foundation Models (AFMs) universally implement an atom-centered energy decomposition:

$$E(\mathcal{C}) = \sum_{i=1}^{N} E_i\left(\{ r_{ij} \}_{j \in \mathcal{N}(i)}; \theta \right)$$

where $\mathcal{C} = \{(Z_i, \mathbf{r}_i)\}$ denotes the atomic configuration, $N$ is the system size, $\mathcal{N}(i)$ is the neighbor list for atom $i$ (within cutoff $r_{\mathrm{cut}}$), and $\theta$ are the model parameters. Forces follow by analytic differentiation:

$$\mathbf{F}_i = -\nabla_{\mathbf{r}_i} E(\mathcal{C})$$

The mapping must exhibit the correct invariances: translational and permutational invariance and, often, rotational equivariance,

$$E(\{Z_i, \mathbf{r}_i\}) = E(\{Z_i, \mathbf{r}_i + \mathbf{t}\}), \qquad E(\{Z_i, R\mathbf{r}_i\}) = E(\{Z_i, \mathbf{r}_i\})$$

for any translation $\mathbf{t}$ and rotation matrix $R$ (Yuan et al., 13 Mar 2025, Chen et al., 2024, Li et al., 5 Dec 2025).
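
To make the decomposition concrete, the sketch below implements a toy pairwise atom-centered energy (not any published AFM architecture) and recovers forces by automatic differentiation; the functional form and all names are illustrative.

```python
import torch

def toy_atomic_energy(positions: torch.Tensor, r_cut: float = 5.0) -> torch.Tensor:
    """Toy E(C) = sum_i E_i({r_ij}): a smooth pairwise term over neighbors within r_cut."""
    n = positions.shape[0]
    rij = positions.unsqueeze(0) - positions.unsqueeze(1)   # (n, n, 3) displacement vectors
    dist = torch.sqrt((rij ** 2).sum(dim=-1) + 1e-12)       # distances; eps keeps gradients finite
    dist = dist + torch.eye(n) * 1e6                        # push self-pairs outside the cutoff
    mask = (dist < r_cut).float()                           # neighbor list N(i)
    e_i = 0.5 * (torch.exp(-dist) * mask).sum(dim=1)        # per-atom energies E_i (any invariant form works)
    return e_i.sum()                                        # total energy E(C)

# Toy configuration of 8 atoms; requires_grad enables F_i = -dE/dr_i via autograd.
positions = torch.randn(8, 3, requires_grad=True)
energy = toy_atomic_energy(positions)
forces = -torch.autograd.grad(energy, positions)[0]
print(float(energy), forces.shape)                          # scalar energy, (8, 3) forces
```

Because the toy energy depends only on interatomic distances, it is translation- and rotation-invariant by construction, mirroring the symmetry requirements stated above.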

2. Scaling Laws, Data Regimes, and Architectural Families

AFMs are governed by neural scaling relations:

$$\mathrm{RMSE}(D, P) \propto D^{-\alpha} P^{-\beta}$$

where $D$ is dataset size and $P$ is parameter count, with exponents $\alpha \approx 0.2{-}0.3$ and $\beta \approx 0.1{-}0.2$ for atomistic problems (Yuan et al., 13 Mar 2025, Wood et al., 30 Jun 2025). UMA models demonstrate compute-optimal scaling for atomic tasks, where model size and dataset size should be balanced according to iso-FLOPs scaling laws (Wood et al., 30 Jun 2025):

$$\log N^*(C) = \alpha \log C + A, \qquad \log D^*(C) = \beta \log C + B$$

Mixture-of-linear-experts (MoLE) layers allow model capacity to increase without slowing inference, enabling high-throughput simulation with billions of total parameters but only tens of millions of active parameters per atomic system.
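
These relations can be illustrated numerically; the exponents, prefactor, and offsets below are placeholders for illustration, not fitted values from the cited papers.

```python
import math

def rmse_scaling(D: float, P: float, alpha: float = 0.25, beta: float = 0.15,
                 k: float = 1.0) -> float:
    """RMSE(D, P) ~ k * D^-alpha * P^-beta (placeholder constants)."""
    return k * D ** (-alpha) * P ** (-beta)

def compute_optimal_allocation(C: float, alpha: float = 0.5, beta: float = 0.5,
                               A: float = 0.0, B: float = 0.0) -> tuple[float, float]:
    """Iso-FLOPs allocation: log N*(C) = alpha*log C + A, log D*(C) = beta*log C + B."""
    n_star = math.exp(alpha * math.log(C) + A)
    d_star = math.exp(beta * math.log(C) + B)
    return n_star, d_star

# Doubling the dataset at fixed parameter count improves RMSE by 2^alpha ~ 1.19x here.
print(rmse_scaling(1e6, 1e8) / rmse_scaling(2e6, 1e8))
# Compute-optimal model/data sizes for a hypothetical FLOPs budget.
print(compute_optimal_allocation(1e21))
```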

Architectural families range from equivariant message-passing networks such as MACE (Batatia et al., 2023) to mixture-of-linear-experts designs such as UMA (Wood et al., 30 Jun 2025). These architectures guarantee strict size-consistency, locality, and physical invariance, and are benchmarked for cross-domain performance (Batatia et al., 2023, Nomura et al., 9 Feb 2025, Wood et al., 30 Jun 2025).

3. Pre-training Strategies and Representation Manifolds

Pre-training of AFMs exploits massive datasets—Materials Project, OMat24, OC20++, OMol25, etc.—with objectives:

  • Supervised energy–force regression (a minimal code sketch follows below):

$$\mathcal{L}_{EF} = \lambda_E \, \| E - \hat{E} \|^2 + \lambda_F \sum_i \| \mathbf{F}_i - \hat{\mathbf{F}}_i \|^2$$
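
A minimal sketch of this objective, assuming the model returns the total energy and per-atom forces as tensors; the loss weights shown are illustrative, not values from any cited training recipe.

```python
import torch

def energy_force_loss(e_pred: torch.Tensor, e_ref: torch.Tensor,
                      f_pred: torch.Tensor, f_ref: torch.Tensor,
                      lambda_e: float = 1.0, lambda_f: float = 10.0) -> torch.Tensor:
    """L_EF = lambda_E * |E - E_hat|^2 + lambda_F * sum_i |F_i - F_hat_i|^2."""
    energy_term = lambda_e * (e_pred - e_ref).pow(2).sum()
    force_term = lambda_f * (f_pred - f_ref).pow(2).sum()
    return energy_term + force_term

# Example for a single configuration of 8 atoms.
loss = energy_force_loss(torch.tensor([1.2]), torch.tensor([1.0]),
                         torch.randn(8, 3), torch.randn(8, 3))
print(loss.item())
```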

To address the lack of interoperability across model architectures, the Platonic representation projects all model-specific atomic embeddings onto a common anchor-defined manifold, using cosine similarities to a DIRECT-sampled anchor basis:

$$T_i(\mathbf{e}_i) = \mathbf{z}_i = [z_{i1}, \ldots, z_{iK}]^T, \qquad z_{ik} = \cos(\mathbf{e}_i, \mathbf{a}_k)$$

yielding a latent space that preserves periodicity and symmetry and enables model-to-model optimal transport and arithmetic (Li et al., 5 Dec 2025). Embedding arithmetic generalizes to material-level and reaction-level algebra, with compatible cross-model operations.
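
A hedged sketch of the projection step, assuming the $K$ anchor structures have already been selected and embedded by each model (the DIRECT sampling step is not reproduced here, and the data are synthetic):

```python
import numpy as np

def platonic_projection(embeddings: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    """Map (n, d_model) embeddings to (n, K) via z_ik = cos(e_i, a_k)."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return e @ a.T

# Two hypothetical models with different embedding widths, each projecting onto its
# own embedding of the same K = 4 anchor structures.
rng = np.random.default_rng(0)
z_a = platonic_projection(rng.normal(size=(10, 64)), rng.normal(size=(4, 64)))
z_b = platonic_projection(rng.normal(size=(10, 128)), rng.normal(size=(4, 128)))
print(z_a.shape, z_b.shape)   # both (10, 4): a common, model-agnostic latent space
```

The key design point is that only the anchor set is shared; each model supplies its own embeddings, so models of different width land on the same $K$-dimensional manifold.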

4. Fine-tuning, Transfer Learning, and Surrogate Distillation

AFMs are adapted to downstream tasks using minimal data and selective modulation of layers:

  • Frozen transfer learning: Selectively re-train only the upper layers (e.g., the latest interaction blocks or the readout head), keeping most of the architecture frozen for efficiency (Radova et al., 21 Feb 2025); a minimal sketch follows this list.
  • Distillation: Teacher FMs generate synthetic data via rattle-relax sampling, and student models (e.g., ACE, PaiNN) are fit to the FM outputs, achieving up to $100\times$ inference speed-ups with marginal accuracy loss (Gardner et al., 12 Jun 2025).
  • Δ-learning via GPR: Residual corrections are learned atop internal model embeddings with Gaussian-process regression, using species- or atomic-level aggregation to fix coverage-induced errors (e.g., underrepresented metal–sulfur chemistry) (Christiansen et al., 28 Feb 2025).
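
As referenced in the first item above, a minimal frozen-transfer sketch for a generic PyTorch model is shown below; the `DummyAFM` class and the `readout` attribute name are hypothetical stand-ins, not the API of any specific AFM package.

```python
import torch

def freeze_for_transfer(model: torch.nn.Module, trainable_prefixes=("readout",)):
    """Freeze all parameters except those whose names start with a given prefix."""
    for name, param in model.named_parameters():
        param.requires_grad = any(name.startswith(p) for p in trainable_prefixes)
    return [p for p in model.parameters() if p.requires_grad]

class DummyAFM(torch.nn.Module):
    """Stand-in with the structure assumed above: interaction blocks plus a readout head."""
    def __init__(self):
        super().__init__()
        self.interactions = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.SiLU())
        self.readout = torch.nn.Linear(16, 1)

model = DummyAFM()
trainable = freeze_for_transfer(model)            # only readout.* parameters stay trainable
optimizer = torch.optim.Adam(trainable, lr=1e-4)  # fine-tune the head on a small downstream set
print([n for n, p in model.named_parameters() if p.requires_grad])
```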

Typical simulation workflows proceed: pre-trained FM → targeted fine-tuning (partial freezing) → validation → surrogate model fit → high-throughput production MD (Radova et al., 21 Feb 2025, Gardner et al., 12 Jun 2025).

5. Model Evaluation, Interoperability, and Bias Detection

Universal metrics are required to assess AFM performance across chemistries and tasks. The Platonic framework demonstrates how local architectures yield consistent, interpretable global statistics when projected onto the common anchor manifold, and it allows early diagnostic detection of training biases or symmetry breaking (Li et al., 5 Dec 2025).
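
One illustrative diagnostic in this spirit compares two models' anchor-projected embeddings coordinate by coordinate; the specific metric used here (a 1D Wasserstein distance per anchor) is an assumption for illustration, not the exact procedure of the cited framework.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def per_anchor_drift(z_a: np.ndarray, z_b: np.ndarray) -> np.ndarray:
    """For each anchor coordinate k, distance between the two models' z_{.k} distributions."""
    return np.array([wasserstein_distance(z_a[:, k], z_b[:, k])
                     for k in range(z_a.shape[1])])

# Large drift concentrated on a subset of anchors flags chemistry where one model is
# biased or undertrained relative to the other (synthetic data shown).
rng = np.random.default_rng(1)
drift = per_anchor_drift(rng.uniform(-1, 1, size=(200, 4)),
                         rng.uniform(-1, 1, size=(200, 4)))
print(drift)
```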

6. Domain Coverage, Applications, and Emergent Capabilities

Atomic Foundation Models are validated across domains spanning inorganic materials, catalytic surfaces, molecular and reactive chemistry, and biologically relevant systems, mirroring the breadth of the pre-training corpora listed above.

Emergent behaviors observed include robust cross-domain generalization, zero/few-shot adaptation, mechanistic interpretability, and algebraic compatibility of embeddings (Li et al., 5 Dec 2025, Wadell et al., 20 Oct 2025, Batatia et al., 2023).

7. Open Challenges and Future Directions

Several critical frontiers remain open for Atomic Foundation Models.

Realizing the full promise of AFMs will require coordinated, large-scale data acquisition, algorithmic innovation for physics-aware learning, scalable compute, and comprehensive benchmarking that measures both foundational capacity and practical simulation utility.


Atomic Foundation Models unify disparate atomistic representations, leveraging massive datasets and equivariant architectures to enable scalable, interpretable, and robust simulations for materials, chemistry, and biological systems. Projection-based frameworks such as the Platonic representation support true interoperability, diagnostic error analysis, and embedding arithmetic across models, positioning AFMs as the scientific infrastructure for next-generation atomistic simulation (Li et al., 5 Dec 2025, Yuan et al., 13 Mar 2025, Wood et al., 30 Jun 2025, Batatia et al., 2023).
