ML Interatomic Potentials

Updated 16 January 2026

Machine-learning interatomic potentials are mathematical models that approximate potential energy surfaces for atomistic simulations with near first-principles accuracy.
These models use symmetry-preserving descriptors and advanced regression techniques, such as graph neural networks and Gaussian approximation potentials, to capture atomic interactions.
Robust MLIPs integrate diverse training data and physics-informed strategies to balance accuracy, transferability, and computational efficiency across various material systems.

Machine-learning interatomic potentials (MLIPs) are mathematical models designed to approximate potential energy surfaces for atomic-scale simulations with near first-principles accuracy at a computational cost orders of magnitude below that of ab initio quantum methods. MLIPs form the backbone of modern molecular dynamics, structural optimization, and materials design workflows, enabling predictive, data-driven simulation of inorganic, organic, and disordered systems from the nanometer to micrometer scale. The rapid evolution of MLIP formalisms, including symmetry-preserving local descriptors, differentiable regression architectures, and physics-informed training strategies, has catalyzed breakthroughs in transferability, data efficiency, and high-throughput applications across chemistry, condensed matter, and metallurgy.

1. Core Theory and Descriptor Formulations

MLIPs decompose the total system energy into a sum over atom-centered energies:

$E_{\mathrm{tot}} = \sum_{i=1}^{N} E_i(\mathcal{X}_i)$

where $\mathcal{X}_i$ encodes the local atomic environment within a cutoff $r_c$ as a high-dimensional descriptor vector. The most widespread descriptor families include:

Behler–Parrinello symmetry functions: Two-body radial Gaussian terms and three-body angular terms, invariant under translation, rotation, and permutation of like atoms (Zuo et al., 2019).
Smooth overlap of atomic positions (SOAP): Neighbor-density expansions in radial and spherical harmonic bases, yielding complete, systematically improvable invariants (Zuo et al., 2019).
Moment tensor potentials (MTP): Scalar invariants formed by contraction of tensor products of neighbor vectors, capturing systematic multi-body correlations (Pandey et al., 2022, Gubaev et al., 2018).
Graph-based and equivariant message passing: Atom-centered graph neural networks (GNNs) propagate features using steerable tensor fields or self-attention mechanisms to encode both geometric and chemical contexts; notable examples include NequIP, Allegro, CHGNet, ACE/MACE, and DPA-Semi (Leimeroth et al., 5 May 2025, Wen et al., 2024, Liu et al., 2023, Yu et al., 2024).

Descriptor dimensionality, radial and angular cutoff radii, and symmetry constraints are hyperparameters chosen per chemical system and model architecture.
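To make the descriptor idea concrete, the following is a minimal NumPy sketch of a radial (two-body) Behler–Parrinello-style symmetry function with a cosine cutoff. The function names, the parameter values (`etas`, `r_s`, `r_c`), and the non-periodic cluster setup are illustrative assumptions, not taken from any cited implementation.

```python
import numpy as np

def cosine_cutoff(r, r_c):
    """Decay smoothly from 1 at r = 0 to 0 at r = r_c; zero beyond the cutoff."""
    return np.where(r < r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def radial_symmetry_functions(positions, i, etas, r_s=0.0, r_c=6.0):
    """Radial descriptor vector for atom i in a non-periodic cluster.

    Each entry is sum_j exp(-eta * (r_ij - r_s)^2) * f_c(r_ij) for one eta.
    """
    r_ij = np.linalg.norm(positions - positions[i], axis=1)
    r_ij = np.delete(r_ij, i)                      # exclude the self-distance
    etas = np.asarray(etas, dtype=float)[:, None]  # shape (n_eta, 1) for broadcasting
    gauss = np.exp(-etas * (r_ij - r_s) ** 2)
    return np.sum(gauss * cosine_cutoff(r_ij, r_c), axis=1)

# Illustrative check of the invariances discussed above (made-up coordinates).
rng = np.random.default_rng(0)
pos = rng.uniform(-3.0, 3.0, size=(8, 3))
etas = [0.5, 1.0, 2.0]
g = radial_symmetry_functions(pos, 0, etas)

# Rigid motion: orthogonal transform plus a translation.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
g_rotated = radial_symmetry_functions(pos @ Q.T + 1.5, 0, etas)

# Permutation of the neighbors, keeping the central atom at index 0.
perm = np.concatenate(([0], rng.permutation(np.arange(1, 8))))
g_permuted = radial_symmetry_functions(pos[perm], 0, etas)
```

Because the descriptor depends only on interatomic distances and sums over neighbors, `g`, `g_rotated`, and `g_permuted` agree to floating-point precision. Angular terms and element-resolved channels are omitted here for brevity.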

2. Model Classes and Regression Frameworks

MLIP architectures vary by complexity, data efficiency, computational cost, and extrapolative robustness:

Feed-forward neural networks (NNP/HDNNP): Map symmetry descriptors to $E_i$ via multilayer networks; their simplicity suits systems with few chemical species or a narrow range of phases (Zuo et al., 2019, Rili, 2024).
Gaussian approximation potentials (GAP): Kernel-ridge regression over SOAP and pairwise descriptors, with sparse reference environments and Bayesian regularization (Zuo et al., 2019, Fellman et al., 2024).
Moment tensor potentials (MTP): Linear regression over contracted tensor basis functions; well suited for complex alloys and fast relaxation (Pandey et al., 2022, Gubaev et al., 2018).
Graph neural networks (GNNs): Leverage equivariant architectures (MACE, NequIP, Allegro, CAMP, CACE, DPA-Semi) for systematic body-order completeness and direct atomic environment embedding (Leimeroth et al., 5 May 2025, Wen et al., 2024, Cheng, 2024, Liu et al., 2023).
Universal MLIPs (uMLIPs): Pretrained, transferable GNNs spanning the chemical space with minimal per-system tuning (CHGNet, M3GNet-DIRECT, ALIGNN-FF, DPA-Semi) (Yu et al., 2024, Liu et al., 2023).

Regression typically targets a weighted loss over energy, force, and stress errors:

$L = w_E\,\frac{1}{N_E} \sum (E_{\mathrm{pred}}-E_{\mathrm{ref}})^2 + w_F\,\frac{1}{3N_F} \sum \|\mathbf{F}_{i,\mathrm{pred}}-\mathbf{F}_{i,\mathrm{ref}}\|^2$

with an analogous weighted stress term where stress labels are available, and additional regularization on model parameters.
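As a rough illustration of the weighted loss above, the sketch below evaluates the energy and force terms with NumPy. The function name `mlip_loss` and the default weights are assumptions made for this example; stress and regularization terms are left out.

```python
import numpy as np

def mlip_loss(E_pred, E_ref, F_pred, F_ref, w_E=1.0, w_F=1.0):
    """Weighted energy + force loss.

    E_pred, E_ref: arrays of shape (N_E,), one energy per configuration.
    F_pred, F_ref: arrays of shape (N_F, 3), one force vector per atom.
    """
    E_pred, E_ref = np.asarray(E_pred, float), np.asarray(E_ref, float)
    F_pred, F_ref = np.asarray(F_pred, float), np.asarray(F_ref, float)
    # (1 / N_E) * sum of squared energy errors
    energy_term = np.mean((E_pred - E_ref) ** 2)
    # (1 / (3 * N_F)) * sum of squared force-vector error norms
    force_term = np.sum((F_pred - F_ref) ** 2) / (3 * F_ref.shape[0])
    return w_E * energy_term + w_F * force_term

# Hand-checkable example: energy term = 0.5, force term = 1/3.
loss = mlip_loss(
    E_pred=[1.0, 2.0], E_ref=[1.0, 1.0],
    F_pred=[[1.0, 0.0, 0.0]], F_ref=[[0.0, 0.0, 0.0]],
    w_E=1.0, w_F=3.0,
)
```

With these inputs the loss is $1.0 \times 0.5 + 3.0 \times \tfrac{1}{3} = 1.5$. In practice the relative weights $w_E$ and $w_F$ are tuned, since force labels are far more numerous than energy labels.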

3. Training Data Generation and Strategies

Construction of robust MLIPs demands diverse reference datasets sampled from quantum-mechanical calculations (typically DFT, but also CCSD(T)/CBS, r$^2$SCAN, etc.):

Static structures: Crystals, surfaces, defects, and alloys sampled over strain, volume, and composition (Pandey et al., 2022, Fellman et al., 2024).
Ab initio MD snapshots: Capturing vibrational, thermal, and liquid configurations at a range of temperatures (Pandey et al., 2022, Gong et al., 16 Aug 2025).
Active learning: On-the-fly relaxation and D-optimality selection to maximize training set informativeness while minimizing ab initio labeling (Gubaev et al., 2018).
Multi-fidelity learning: Joint training on multiple levels of theory (e.g., PBE, meta-GGA, CCSD(T)), leveraging abundant low-fidelity data with minimal high-fidelity coverage for optimal accuracy (Kim et al., 2024).
Ensemble knowledge distillation: Generating synthetic force labels via ensemble predictions when original QC datasets include only energies (Matin et al., 18 Mar 2025).

Dataset design principles prioritize phase, composition, and environment diversity to ensure interpolation robustness and controlled extrapolation risk.
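The D-optimality criterion mentioned under active learning can be sketched with an extrapolation grade computed against a square "active set" of feature rows. This is one common formulation, shown with generic feature vectors; whether it matches the cited workflow in every detail is an assumption, and the names `extrapolation_grade` and `maybe_swap` are hypothetical.

```python
import numpy as np

def extrapolation_grade(active_set, candidate):
    """Return (gamma, k) for a candidate feature row.

    Writes candidate = c @ active_set and reports gamma = max_j |c_j|.
    Swapping row k for the candidate multiplies |det(active_set)| by gamma.
    """
    # c @ A = b is equivalent to A.T @ c = b
    c = np.linalg.solve(active_set.T, candidate)
    k = int(np.argmax(np.abs(c)))
    return float(np.abs(c[k])), k

def maybe_swap(active_set, candidate, threshold=1.0):
    """Accept the candidate into the active set only if it increases |det|."""
    gamma, k = extrapolation_grade(active_set, candidate)
    if gamma > threshold:
        updated = active_set.copy()
        updated[k] = candidate
        return updated, True   # candidate would be sent for ab initio labeling
    return active_set, False

# Small deterministic example (illustrative numbers only).
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 1.0, 4.0]])
outside = 2.5 * A[1]                 # extrapolative: 2.5x beyond an active row
inside = 0.5 * A[0] + 0.3 * A[2]     # interpolative combination of active rows

gamma_out, k_out = extrapolation_grade(A, outside)
A_new, swapped = maybe_swap(A, outside)
gamma_in, _ = extrapolation_grade(A, inside)
```

A grade above the threshold flags a configuration whose features lie outside the span already covered, so only those configurations need new reference calculations.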

4. Accuracy Metrics, Computational Cost, and Model Selection

Benchmarking frameworks evaluate MLIPs across:

Energy/force/stress RMSE/MAE: Reported in meV/atom and eV/Å; SOAP-based GAP and body-order complete GNNs (MACE, NequIP) achieve $<5$ meV/atom and $<0.05$ eV/Å for well-sampled systems (Zuo et al., 2019, Leimeroth et al., 5 May 2025, Wen et al., 2024).
Material properties: Prediction of elastic constants, phonon spectra, migration barriers, melting points, stacking fault and surface energies within roughly $1$–$10\%$ of DFT or experiment (Pandey et al., 2022, Fellman et al., 2024, Liu et al., 2023).
Computational scaling: GAP-SOAP and high-dimensional GNNs require substantial resources (CPU/GPU and memory), but tabulated/interpolated models (tabGAP) approach the efficiency of classical interatomic potentials (Byggmästar et al., 2022, Fellman et al., 2024). GPU-accelerated MLIPs may reach speedups of $\sim 10^4$ over DFT and scale to million-atom MD (Leimeroth et al., 5 May 2025).
Extrapolation and MD stability: Model performance degrades outside high-density reference domains; minimalist models, nonlinear ACE, and PIWSL approaches improve stability and smoothness (Robredo-Magro et al., 21 Nov 2025, Takamoto et al., 2024).

Pareto optimization balances ultimate accuracy, cost, and usability; nonlinear ACE, MACE, Allegro, and NequIP occupy dominant speed/accuracy frontiers for complex materials (Leimeroth et al., 5 May 2025).
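A Pareto front over cost and error can be extracted with a few lines of plain Python. The sketch below is a generic illustration; the model labels and numbers are placeholders invented for the example, not benchmark results from the cited studies.

```python
def pareto_front(models):
    """Return names of non-dominated models, sorted by increasing cost.

    models: dict mapping name -> (cost, error); lower is better for both.
    A model is dominated if another is no worse in both and better in one.
    """
    front = []
    for name, (cost, err) in models.items():
        dominated = any(
            (c <= cost and e <= err) and (c < cost or e < err)
            for other, (c, e) in models.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front, key=lambda n: models[n][0])

# Placeholder values: (cost per atom-step, force RMSE), arbitrary units.
candidates = {
    "model_a": (1.0, 10.0),
    "model_b": (5.0, 4.0),
    "model_c": (6.0, 6.0),   # slower and less accurate than model_b
    "model_d": (50.0, 1.5),
}
front = pareto_front(candidates)
```

Here `front` is `["model_a", "model_b", "model_d"]`: `model_c` drops out because `model_b` is both cheaper and more accurate. Usability, the third axis named above, is not captured by this two-objective sketch.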

5. Physics-Informed and Weakly Supervised Extensions

Recent advances address deficiencies in generalization and conservative force prediction:

Physics-Informed Weakly Supervised Learning (PIWSL): Incorporates Taylor-expansion consistency and path-independence (PITC, PISC losses) into MLIP training. This enforces local energy-force response and conservative forces, yielding up to $2.6\times$ reduction in energy errors and $10$–$30\%$ reduction in force errors under data scarcity. PIWSL also supports training without direct force references, enabling fine-tuning on high-level energies only (Takamoto et al., 2024).
Multi-fidelity and ensemble distillation: Enables MLIPs trained on partial or weak labels (e.g., energies only, with forces supplied by ensemble distillation) to reach near-benchmark performance and enhanced MD stability (Matin et al., 18 Mar 2025, Kim et al., 2024).
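The idea behind a Taylor-expansion consistency term can be illustrated on a toy model: for a small displacement $\delta\mathbf{r}$, the energy should satisfy $E(\mathbf{r}+\delta\mathbf{r}) \approx E(\mathbf{r}) - \mathbf{F}(\mathbf{r})\cdot\delta\mathbf{r}$. The sketch below is a simplified reading of that principle, not the loss as defined by Takamoto et al.; the harmonic pair potential and all function names are assumptions standing in for a trained MLIP.

```python
import numpy as np

def toy_energy(positions, spring=1.0, r0=1.0):
    """Harmonic pair energy over all atom pairs (toy stand-in for an MLIP)."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    upper = np.triu_indices(len(positions), k=1)   # count each pair once
    return 0.5 * spring * np.sum((dist[upper] - r0) ** 2)

def toy_forces(positions, spring=1.0, r0=1.0):
    """Analytic forces F_i = -dE/dr_i for toy_energy."""
    diff = positions[:, None, :] - positions[None, :, :]   # r_i - r_j
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, 1.0)                    # avoid 0/0 on the diagonal
    coeff = spring * (dist - r0) / dist
    np.fill_diagonal(coeff, 0.0)                   # no self-interaction
    return -np.sum(coeff[:, :, None] * diff, axis=1)

def taylor_residual(energy_fn, force_fn, positions, delta):
    """E(r + delta) minus its first-order Taylor estimate E(r) - F(r) . delta."""
    e_taylor = energy_fn(positions) - np.sum(force_fn(positions) * delta)
    return energy_fn(positions + delta) - e_taylor

pos = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.5, 0.0]])
f0 = toy_forces(pos)
delta = 1e-4 * f0 / np.linalg.norm(f0)            # small step along the forces

# Consistent energy/force pair: residual is second order in |delta|.
res_consistent = taylor_residual(toy_energy, toy_forces, pos, delta)
# Deliberately inconsistent forces (scaled by 0.5): residual is first order.
res_inconsistent = taylor_residual(toy_energy, lambda p: 0.5 * toy_forces(p), pos, delta)
```

Squaring such a residual and averaging over random small displacements gives a label-free penalty: it needs no reference data at the displaced geometry, which is what makes the supervision "weak".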

Integration with modern toolkits (Open Catalyst, DeePMD-kit) and robust hyperparameter tuning (e.g., Optuna) are critical for deployment.

6. High-Throughput Materials Design, Universal MLIPs, and Domain-Specific Impact

MLIPs unlock computationally intractable workflows in materials discovery and chemistry:

High-entropy/alloy screening and elastic/mechanical optimization: MTP-MLIPs combined with automated composition sampling enable rational design of alloys and direct property mapping, achieving near-DFT accuracy for bulk and mechanical constants (Pandey et al., 2022, Byggmästar et al., 2022).
Universal MLIPs: Attention-based GNN potentials (DPA-Semi, CHGNet, M3GNet-DIRECT, MACE-MP-0, ALIGNN-FF) generalize across hundreds of elements and crystal prototypes with no per-system retraining (Liu et al., 2023, Yu et al., 2024).
Disordered and amorphous systems: Fine-tuning universal models (CHGNet) on amorphous alloy datasets yields transferable MLIPs accurately predicting density, $E$, $T_g$, Young's modulus and enabling direct composition–property mapping (Gong et al., 16 Aug 2025).
Ferroelectric, phase-change, and molecular systems: Minimalist MLIPs and body-order complete GNNs reproduce complex phase transitions, topologies, and nonlinear effects even on sparse, default-data regimes (Robredo-Magro et al., 21 Nov 2025, Wen et al., 2024).

Universal MLIP capabilities are rapidly expanding with transfer-learning, attention weighting, and comprehensive data benchmarks, enabling foundation potentials for the periodic table (Liu et al., 2023, Yu et al., 2024).

7. Emerging Directions, Limitations, and Best Practices

Current challenges and active frontiers include:

Extrapolation risk management: Out-of-domain predictions remain vulnerable to unphysical output; active learning and physics-based constraints only partially mitigate this risk (Takamoto et al., 2024, Mishin, 2021).
Incomplete conservative force enforcement: While curl reduction is possible, full path-independence and global energy–force consistency pose an open problem [240