
Machine-Learned Interatomic Potentials

Updated 5 February 2026
  • Machine-learned interatomic potentials are data-driven models that predict atomic energies and forces with near-quantum accuracy using locality-based energy partitioning.
  • They leverage symmetry-preserving descriptors such as SOAP and Behler–Parrinello symmetry functions, combined with regression models including neural networks and kernel methods, for enhanced transferability.
  • Applications span molecular dynamics, defect analysis, and phase transitions, employing active learning and multiscale strategies to optimize accuracy and computational cost.

Machine-learned interatomic potentials (MLIPs) are data-driven models that predict atomic energies and forces with near-first-principles accuracy and are employed as surrogates for explicit quantum-mechanical calculations in atomistic simulations. MLIPs have rapidly supplanted classical analytic potentials in many areas of materials science, computational chemistry, and molecular simulation, offering systematic improvements in accuracy, transferability, and flexibility. Their deployment now spans static structure optimization, molecular dynamics (MD) of complex and anharmonic systems, high-throughput property screening, and multi-scale modeling frameworks.

1. Mathematical Foundations and Model Architectures

MLIPs universally adopt a locality-based energy partitioning, expressing the total energy as a sum over atomic contributions:

$$E = \sum_{i=1}^{N} E_i\bigl(\{\mathbf{r}_{ij}\}_{j:\,|\mathbf{r}_{ij}| < r_c}\bigr),$$

where each $E_i$ depends only on neighboring atoms within the cutoff radius $r_c$ (Mishin, 2021). To achieve rotational, translational, and permutational invariance, diverse descriptors and regression models have been designed:

Descriptors:

  • Widely used symmetry-invariant representations of local atomic environments include Smooth Overlap of Atomic Positions (SOAP), Behler–Parrinello symmetry functions, and atomic cluster expansion (ACE) basis functions.

Regression Models:

  • Common choices include neural networks (both feed-forward and equivariant graph architectures such as NequIP, MACE, and Allegro), kernel and Gaussian-process methods (e.g., GAP), and linear models fit by least squares (e.g., linear ACE).
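The locality ansatz above can be sketched in a few lines of NumPy. Here `atomic_energy` is a placeholder pair-sum standing in for a learned per-atom model; the functional form is purely illustrative, not any published MLIP:

```python
import numpy as np

def neighbors_within_cutoff(positions, i, r_c):
    """Displacement vectors r_ij to all atoms j != i with |r_ij| < r_c."""
    disp = positions - positions[i]
    dist = np.linalg.norm(disp, axis=1)
    mask = (dist < r_c) & (dist > 0.0)
    return disp[mask]

def atomic_energy(r_ij):
    """Stand-in for a learned per-atom model E_i; here a smooth pair term."""
    r = np.linalg.norm(r_ij, axis=1)
    return np.sum(np.exp(-r))  # placeholder, not a real MLIP

def total_energy(positions, r_c):
    """E = sum_i E_i({r_ij : |r_ij| < r_c}) -- locality-based partitioning."""
    return sum(atomic_energy(neighbors_within_cutoff(positions, i, r_c))
               for i in range(len(positions)))

# Three atoms; the third is beyond the cutoff and contributes zero.
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
E = total_energy(pos, r_c=3.0)
```

Because each $E_i$ sees only its neighborhood, the cost scales linearly with the number of atoms, which is what makes MLIPs viable for large-scale MD.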

2. Training Protocols, Data Generation, and Loss Functions

Data Selection and Sampling:

  • Accurate and transferable MLIPs require representative, diverse training sets covering relevant regions of configuration space. Automated approaches include information-entropy maximization, leverage-score subsampling, and active-learning protocols that couple MD exploration with uncertainty quantification to supplement training data only where the current MLIP is most uncertain (Baghishov et al., 6 Jun 2025, Kang et al., 2024, Verdi et al., 2021, Allen et al., 2022).
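One of the automated selection schemes mentioned above, leverage-score subsampling, admits a compact sketch. The descriptor matrix here is random stand-in data; a real pipeline would use SOAP/ACE features of candidate configurations:

```python
import numpy as np

def leverage_scores(X):
    """Leverage score of each configuration's descriptor row:
    diag of X (X^T X)^+ X^T, computed stably via a thin QR decomposition."""
    Q, _ = np.linalg.qr(X)       # X = Q R, Q has orthonormal columns
    return np.sum(Q**2, axis=1)  # h_i = ||Q_i||^2, each in [0, 1]

def subsample(X, k):
    """Keep the k configurations with the largest leverage scores."""
    h = leverage_scores(X)
    return np.argsort(h)[::-1][:k]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))    # 200 candidate configs, 8-dim descriptors
keep = subsample(X, k=20)
```

High-leverage rows are the configurations that most strongly determine a linear fit in descriptor space, so keeping them preserves the diversity of the training set at a fraction of the labeling cost.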

Labeling:

  • Energies and forces (and, in some applications, stresses/virials) are computed from ab initio methods (DFT, CCSD(T), etc.) at variable levels of electronic structure precision (Baghishov et al., 6 Jun 2025, Matin et al., 18 Mar 2025). The choice of electronic structure ("reference level") sets a floor on attainable MLIP accuracy.

Loss Functions and Regularization:

  • Training most often minimizes a composite loss combining energy and force errors:

$$L = \sum_{m=1}^{M} \left\{ \frac{w_E^2}{N_m^2}\bigl(\hat{E}_m - E_m\bigr)^2 + w_F^2 \sum_{i=1}^{3N_m} \bigl(\hat{F}_{mi} - F_{mi}\bigr)^2 \right\},$$

with user-controlled weights $(w_E, w_F)$ to balance energy and force accuracy. Recent developments incorporate physically informed losses enforcing Taylor consistency between energies and forces, as well as energy-conservation or symmetry-restoration terms (Takamoto et al., 2024, Matin et al., 18 Mar 2025, Bigi et al., 22 Jan 2026).
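The composite loss above translates directly into code. This is a plain NumPy illustration; the weights and toy data are assumptions, not values from the cited works:

```python
import numpy as np

def composite_loss(E_pred, E_ref, F_pred, F_ref, n_atoms, w_E=1.0, w_F=0.1):
    """L = sum_m [ (w_E/N_m)^2 (E_hat_m - E_m)^2
                   + w_F^2 sum_i (F_hat_mi - F_mi)^2 ]."""
    L = 0.0
    for m in range(len(E_ref)):
        # Energy term, normalized per atom via the 1/N_m^2 factor.
        L += (w_E / n_atoms[m]) ** 2 * (E_pred[m] - E_ref[m]) ** 2
        # Force term: sum over all 3*N_m Cartesian components.
        L += w_F ** 2 * np.sum((F_pred[m] - F_ref[m]) ** 2)
    return L

# Two toy structures (2 and 3 atoms); forces are (N_m, 3) arrays.
E_pred, E_ref = [1.0, 2.0], [1.1, 2.0]
F_pred = [np.zeros((2, 3)), 0.1 * np.ones((3, 3))]
F_ref = [np.zeros((2, 3)), np.zeros((3, 3))]
L = composite_loss(E_pred, E_ref, F_pred, F_ref, n_atoms=[2, 3])
```

The $1/N_m^2$ normalization keeps the energy term comparable across structures of different sizes, while $w_F$ controls how much gradient information the fit absorbs from forces.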

Model Complexity:

3. Model Evaluation, Computational Efficiency, and Software

Quantitative Benchmarks:

Evaluation Speed and Tabulation:

Integration and Software:

  • Standardized software frameworks and libraries now exist for development, training, and deployment of MLIPs (e.g., QUIP/ASE, mlip, open-source LAMMPS/ML-MIX integration) (Brunken et al., 28 May 2025, Birks et al., 26 Feb 2025). These expose APIs for various model types and molecular dynamics engines, including JAX-MD and ASE.

4. Advanced Methods: Multiscale Coupling, Hybridization, and Active Learning

Spatial Mixing and Multiscale Schemes:

  • The ML-MIX approach enables spatially inhomogeneous application of MLIPs by mixing expensive (high-accuracy) and cheap (low-order/fitted-for-efficiency) MLIPs over the simulation domain, via smooth per-atom weighting functions. This reduces the computational cost in large-scale defect or catalysis simulations by up to ~11× for ~8000 atoms, and by even more in larger domains, without significant loss of accuracy in the "active" regions (Birks et al., 26 Feb 2025).
  • Constrained linear fitting of the cheap MLIP (enforcing exact elastic constants and optimal reproduction of the expensive reference in bulk configurations) ensures thermodynamic and mechanical consistency within the blended region (Birks et al., 26 Feb 2025).
  • Hybrid QM/MM–MLIP coupling for crystalline defects can be rigorously error-analyzed, with controllable error bounds via Taylor and virial matching in the MM/MLIP region, using atomic cluster expansion descriptors and linear least-squares fitting (Chen et al., 2021).
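A minimal sketch of the smooth per-atom weighting idea behind such spatial mixing. The cosine ramp and the radii are illustrative assumptions, not the actual ML-MIX implementation:

```python
import numpy as np

def smooth_weight(d, r_in, r_out):
    """Per-atom mixing weight: 1 inside r_in (expensive MLIP applies),
    0 beyond r_out (cheap MLIP applies), smooth cosine ramp in between."""
    w = np.clip((r_out - d) / (r_out - r_in), 0.0, 1.0)
    return 0.5 * (1.0 - np.cos(np.pi * w))

def mixed_forces(F_expensive, F_cheap, dist_to_core, r_in=5.0, r_out=8.0):
    """Blend the two potentials atom-by-atom:
    F_i = w_i F_i^expensive + (1 - w_i) F_i^cheap."""
    w = smooth_weight(dist_to_core, r_in, r_out)[:, None]
    return w * F_expensive + (1.0 - w) * F_cheap

# Three atoms at distances 0, 6.5, and 10 from the defect core:
# fully "expensive", half-blended, and fully "cheap" respectively.
F_mixed = mixed_forces(np.ones((3, 3)), np.zeros((3, 3)),
                       dist_to_core=np.array([0.0, 6.5, 10.0]))
```

The smoothness of the ramp matters: a discontinuous weight would inject spurious forces at the interface between the two regions.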

Physically Informed and Weakly Supervised Learning:

  • Physics-informed losses (Taylor expansion, conservative-force consistency) injected into the training objective eliminate unphysical force/energy artifacts and yield improved MD stability, even when training labels are sparse or forces are missing (Takamoto et al., 2024).
  • Ensemble knowledge distillation allows energies-only datasets (e.g., high-level quantum chemistry where calculating forces is infeasible) to provide force supervision to student MLIPs via ensemble-averaged teacher models, enhancing both energy and force accuracy (Matin et al., 18 Mar 2025).
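The distillation idea can be illustrated with a toy energies-only teacher ensemble. Real implementations differentiate the teacher models analytically; this sketch substitutes central finite differences, and the harmonic teachers are assumptions for illustration:

```python
import numpy as np

def ensemble_force_labels(positions, teachers, eps=1e-4):
    """Force labels from an energies-only teacher ensemble:
    F = -grad of the ensemble-mean energy, via central finite differences."""
    def mean_E(pos):
        return np.mean([t(pos) for t in teachers])
    F = np.zeros_like(positions)
    for i in range(positions.shape[0]):
        for a in range(3):
            p_plus, p_minus = positions.copy(), positions.copy()
            p_plus[i, a] += eps
            p_minus[i, a] -= eps
            F[i, a] = -(mean_E(p_plus) - mean_E(p_minus)) / (2 * eps)
    return F

# Two toy "teachers": harmonic wells with slightly different stiffness.
# Their mean is E = 0.5 * |p|^2, whose exact force is F = -p.
teachers = [lambda p: 0.4 * np.sum(p**2), lambda p: 0.6 * np.sum(p**2)]
pos = np.array([[1.0, 0.0, 0.0]])
F = ensemble_force_labels(pos, teachers)
```

Averaging before differentiating is the key step: the student receives a single, smoothed force target even though no individual teacher ever provided forces.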

Transferability and Minimalist Approaches:

  • Studies challenge the prevailing notion that exhaustive datasets and hyperparameter optimization are necessary; minimalist MLIPs trained on limited, auto-selected datasets can discover nontrivial structures (e.g., new polymorphs, topological textures) well outside the training domain, especially when descriptors are sufficiently expressive (e.g., SOAP, Allegro) (Robredo-Magro et al., 21 Nov 2025).

Active Learning for Strongly Anharmonic Regimes:

  • Active learning loops and ensemble uncertainty estimates, when combined with MLIP-MD, enable robust coverage of strongly anharmonic configuration space and prevent failures such as missing/fake metastable states—crucial for accurate thermal transport and rare-event kinetics (Kang et al., 2024, Verdi et al., 2021).
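A sketch of the ensemble-disagreement trigger used in such loops. The threshold value and the toy models are assumptions; real loops would compare forces from independently trained MLIPs:

```python
import numpy as np

def force_uncertainty(positions, ensemble):
    """Per-atom force standard deviation across an MLIP ensemble
    (model disagreement serves as the uncertainty estimate)."""
    F = np.stack([model(positions) for model in ensemble])  # (n_models, n_atoms, 3)
    return np.linalg.norm(F.std(axis=0), axis=1)            # (n_atoms,)

def needs_dft_label(positions, ensemble, threshold=0.05):
    """Active-learning trigger: flag a configuration for ab initio labeling
    when any atom's ensemble force disagreement exceeds the threshold."""
    return bool(np.max(force_uncertainty(positions, ensemble)) > threshold)

# Two toy models that disagree by 20% on the force magnitude.
ensemble = [lambda p: -p, lambda p: -1.2 * p]
flag = needs_dft_label(np.array([[1.0, 0.0, 0.0]]), ensemble)
```

Configurations that trip the trigger during MD are labeled ab initio and added to the training set, so new data is generated only where the current model is least reliable.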

5. Applications and Demonstrated Impact

Static and Dynamic Properties:

Scale Bridging:

Performance Table: Model Cost and Accuracy (example summary)

| Model type | Energy RMSE (meV/atom) | Force RMSE (meV/Å) | Eval time (μs/atom·step) | Reference |
|---|---|---|---|---|
| Nonlinear ACE | 8–12 | 30–35 | 0.2–0.4 | (Leimeroth et al., 5 May 2025) |
| NequIP/MACE/Allegro GNNs | 6–14 | 25–55 | 4.5–6 | (Leimeroth et al., 5 May 2025; Robredo-Magro et al., 21 Nov 2025) |
| Tabulated low-dim GAP | ~2–3 | ~40–100 | 0.1–0.3 | (Byggmästar et al., 2022; Fellman et al., 2024) |
| SOAP-GAP | 3–8 | 25–80 | 2.0–4.0 | (Byggmästar et al., 2022; Robredo-Magro et al., 21 Nov 2025) |
| Classical EAM/MEAM | 15–100 | 200–600 | 0.005–0.1 | (Leimeroth et al., 5 May 2025; Marchant et al., 2022) |

Values are material- and system-dependent; full tables with details appear in (Leimeroth et al., 5 May 2025, Byggmästar et al., 2022, Fellman et al., 2024, Rosenbrock et al., 2019).

6. Practical Guidelines and Recommendations


Machine-learned interatomic potentials now form the backbone of high-accuracy, large-scale atomistic simulation, bridging first-principles and empirical methods with theory-driven design, scalable architectures, and integrated practical workflows (Leimeroth et al., 5 May 2025, Mishin, 2021, Birks et al., 26 Feb 2025, Robredo-Magro et al., 21 Nov 2025).
