Machine Learning Interatomic Potentials

Updated 20 October 2025
  • MLIPs are computational models that use machine learning to approximate atomic potential energy surfaces with ab initio accuracy and improved efficiency.
  • They employ extensive sets of radial and angular descriptors alongside advanced regression methods like neural networks, Gaussian processes, and graph neural networks.
  • MLIPs enable high-throughput materials screening, multi-fidelity training, and automated simulation pipelines for robust prediction of complex materials properties.

Machine Learning Interatomic Potentials (MLIPs) are computational models that approximate the potential energy surface of atomic systems by employing machine learning algorithms trained on quantum-mechanical data. MLIPs serve as a bridge between classical force fields and first-principles electronic structure methods, offering both ab initio–level accuracy and dramatically increased computational efficiency. The field has seen rapid methodological advances, notably in descriptor construction, regression architectures, data efficiency, active learning, and the systematic treatment of complex materials properties. This article reviews the conceptual foundations, major methodologies, accuracy–cost trade-offs, benchmark findings, and practical considerations that situate MLIPs in contemporary materials and molecular simulation.

1. Conceptual Foundations and Descriptor Formalism

The conceptual basis for modern MLIPs lies in an extension of traditional embedding-energy formulations (such as those underpinning the Embedded Atom Method, EAM, and Modified Embedded Atom Method, MEAM), generalized significantly by moving beyond the uniform density approximation (UDA). Traditional potentials typically reduce the atomic energy to a functional of a scalar local electron density, e.g., $E^{(i)} = F(\rho(r_i))$ with $\rho(r_i) = \sum_j p(r_{ij})$ (Takahashi et al., 2017). Angular (three-body) terms are incorporated in MEAM, but both EAM and MEAM are limited in flexibility by a small number of functional degrees of freedom and descriptors.
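
To make the contrast concrete, the following is a minimal sketch of an EAM-style energy evaluation; the pair density $p$ and embedding function $F$ here are toy illustrative choices, not fitted forms from any published potential:

```python
import numpy as np

def eam_energy(positions, cutoff=5.0):
    """EAM-style energy: E_i = F(rho_i) with rho_i = sum_j p(r_ij).

    Illustrative sketch only; p and F are toy choices, not fitted
    functions from any published EAM potential.
    """
    def p(r):                    # pair contribution to the local density
        return np.exp(-r) * (r < cutoff)

    def F(rho):                  # embedding function (toy square-root form)
        return -np.sqrt(rho)

    n = len(positions)
    energy = 0.0
    for i in range(n):
        r_ij = np.linalg.norm(positions - positions[i], axis=1)
        rho_i = p(r_ij[np.arange(n) != i]).sum()   # exclude self-distance
        energy += F(rho_i)
    return energy
```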

MLIPs systematize and greatly extend this framework by constructing energy functionals over large sets of radial and angular descriptors, $E^{(i)} = F(b_{10}^{(i)}, b_{20}^{(i)}, \ldots, b_{n_{\max} l_{\max}}^{(i)})$, where the $b_{nl}$ are systematically constructed from the local atomic environment: pairwise ($l = 0$) and higher-body angular ($l > 0$) (Takahashi et al., 2017). For example, $b_{n0}^{(i)} = \sum_j f_n(r_{ij})$ (generalized pairwise) and $b_{nl}^{(i)} = \sum_{j,k} f_n(r_{ij}) f_n(r_{ik}) \cos^l(\gamma_{jik})$ (angular, inspired by spherical harmonics). The expansion to tens of thousands of descriptors enables MLIPs to represent a much broader class of interatomic interactions than classical force fields while preserving interpretability.
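
As an illustration of this descriptor formalism, the sketch below evaluates $b_{n0}$ and $b_{nl}$ for a single atom; the Gaussian radial basis $f_n$ is an assumed illustrative choice, and production MLIPs additionally use smooth cutoff functions and far larger basis sets:

```python
import numpy as np

def descriptors(positions, i, n_max=4, l_max=2, cutoff=5.0):
    """Pairwise and angular descriptors for atom i, following
    b_n0 = sum_j f_n(r_ij) and
    b_nl = sum_{j,k} f_n(r_ij) f_n(r_ik) cos^l(gamma_jik).

    Sketch only: the Gaussian radial basis f_n is an illustrative choice.
    """
    rel = positions - positions[i]
    rel = np.delete(rel, i, axis=0)                  # drop the central atom
    r = np.linalg.norm(rel, axis=1)
    rel, r = rel[r < cutoff], r[r < cutoff]

    centers = np.linspace(0.5, cutoff, n_max)
    f = np.exp(-(r[None, :] - centers[:, None])**2)  # f_n(r_ij), shape (n_max, n_nbr)

    cosg = (rel @ rel.T) / np.outer(r, r)            # cos(gamma_jik) for all pairs (j, k)

    b = []
    for n in range(n_max):
        b.append(f[n].sum())                         # b_n0: generalized pairwise term
        for l in range(1, l_max + 1):
            # b_nl: sum over neighbor pairs weighted by cos^l of the bond angle
            b.append(np.einsum('j,k,jk->', f[n], f[n], cosg**l))
    return np.array(b)
```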

More recent innovations include explicit treatment of high body-order correlations via atomic cluster expansion (ACE), moment tensor potentials (MTP), and structured message passing on graphs (see Section 2), as well as frameworks that allow Cartesian tensor constructions rather than relying exclusively on spherical harmonics (Wen et al., 18 Nov 2024).

2. Regression Strategies and Model Architectures

Advances in descriptor sets have paralleled diversification in regression methodologies:

  • Linearized MLIPs and Polynomial Expansions: Early high-accuracy MLIPs exploited large descriptor sets but employed relatively simple polynomial regressors, demonstrating that most improvements stemmed from descriptor richness rather than additional model nonlinearity (Takahashi et al., 2017); a minimal sketch of such a linear fit follows this list.
  • Gaussian Process Regression and Kernel Methods: Models such as the Gaussian Approximation Potential (GAP) employed non-parametric regression (SOAP kernel) to offer systematic improvement at increased computational cost.
  • Artificial Neural Networks (ANNs): HDNNP and AENET utilize atom-centered neural networks with flexible representations, though often requiring careful regularization to avoid overfitting in small data regimes (Choyal et al., 2023).
  • Moment Tensor Potentials (MTP): These models use tensor basis expansions of local environments, efficiently capturing both radial and angular information and enabling data-efficient, accurate regression (Pandey et al., 2022, Choyal et al., 2023).
  • Atomic Cluster Expansion (ACE): Both linear and nonlinear ACE frameworks allow systematic order-by-order improvement and competitive accuracy/cost trade-offs. Nonlinear ACE, in particular, forms a Pareto front in accuracy–speed analyses (Leimeroth et al., 5 May 2025).
  • Equivariant Graph Neural Networks (GNNs): Methods such as NequIP, Allegro, and MACE perform message passing over atomic graphs, explicitly encoding spatial symmetries (E(3)-equivariance) and supporting near ab initio accuracy on complex systems (Leimeroth et al., 5 May 2025, Brunken et al., 28 May 2025).
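
As referenced in the first bullet, a linearized MLIP reduces to ordinary regularized linear regression once the descriptors are fixed. The sketch below reuses the hypothetical `descriptors` routine from Section 1 and fits total energies by ridge regression; the regularization strength is an illustrative choice, not a tuned value:

```python
import numpy as np

def structure_features(positions, **kw):
    # Total-energy features: per-atom descriptors summed over the structure.
    return sum(descriptors(positions, i, **kw) for i in range(len(positions)))

def fit_linear_mlip(structures, energies, lam=1e-6):
    X = np.stack([structure_features(pos) for pos in structures])
    y = np.asarray(energies)                    # reference (e.g., DFT) energies
    # Ridge-regularized normal equations: (X^T X + lam I) w = X^T y
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    return w                                    # E_pred = structure_features(pos) @ w
```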

Recent Innovations: Cartesian Atomic Moment Potentials (CAMP) utilize Cartesian tensor representations for atomistic environments, allowing efficient contraction and systematic order improvement while achieving competitive accuracy and speed (Wen et al., 18 Nov 2024). Additionally, Transformer-based MLIPs (TransIP) relax architectural symmetry constraints, instead enforcing symmetry compliance in a latent embedding space through contrastive loss, thereby improving flexibility and scalability while retaining equivariance (Elhag et al., 25 Sep 2025).

3. Data Efficiency, Active Learning, and Multi-fidelity Training

The computational cost in MLIP construction is dominated by the generation of high-quality reference data (most often from DFT, but increasingly from higher-accuracy wavefunction theory). Several strategies have been developed for maximizing data efficiency and maintaining transferability:

  • Advanced Sampling and Subset Selection: Information-entropy maximization, leverage (CUR) sampling, and block leverage scoring are used to ensure training sets cover diverse atomic configurations and avoid redundancy, reducing computational effort by factors of 10–100 with negligible loss in accuracy (Baghishov et al., 6 Jun 2025); a leverage-scoring sketch follows this list.
  • Active Learning (AL): Iterative schemes that combine on-the-fly exploration with uncertainty quantification guide MD simulations toward poorly learned regions of configuration space, allowing targeted acquisition of new quantum data. These approaches are especially important for strongly anharmonic materials and rare-event sampling (Kang et al., 18 Sep 2024).
  • Multi-fidelity Learning: By combining data at different ab initio levels (e.g., low-fidelity PBE, high-fidelity SCAN or CCSD(T)), multi-fidelity MLIPs efficiently interpolate high-fidelity properties in otherwise underexplored regions of chemical space. One-hot encoding of fidelity labels enables a single network to simultaneously model multiple reference surfaces (Kim et al., 12 Sep 2024). “Δ-learning” combines a low-cost baseline potential (e.g., tight binding) with an ML-corrected difference to achieve CCSD(T)-like accuracy for periodic and vdW systems at affordable cost (Ikeda et al., 19 Aug 2025).
  • Ensemble Knowledge Distillation: For QC datasets without force labels, ensembles of “teacher” models predicting forces via autodifferentiation allow training of “student” MLIPs that combine accurate energies and reliable force predictions, improving performance and MD stability (Matin et al., 18 Mar 2025).
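
A minimal sketch of the leverage-score (CUR-style) subset selection mentioned in the first bullet: rows of the descriptor matrix with the highest statistical leverage are retained as the training subset. The rank heuristic and threshold are illustrative assumptions, not values from any cited code:

```python
import numpy as np

def leverage_select(X, k, rank=None):
    """Select k rows of descriptor matrix X (rows = candidate
    configurations) by statistical leverage. Illustrative sketch of
    CUR-style sampling; rank/threshold choices are assumptions.
    """
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    r = rank or int(np.sum(s > 1e-10 * s[0]))    # numerical rank of X
    scores = np.sum(U[:, :r]**2, axis=1)         # leverage score of each row
    return np.argsort(scores)[::-1][:k]          # keep the k most informative rows

# Usage: idx = leverage_select(X, k=100)
#        train_structures = [structures[i] for i in idx]
```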

4. Accuracy–Cost Trade-offs and Benchmark Studies

Systematic benchmark studies comparing MLIP families (GAP, HDNNP, MTP, ACE, equivariant GNNs) reveal the following trends (Leimeroth et al., 5 May 2025):

  • Nonlinear ACE and equivariant GNNs (MACE, NequIP) form the Pareto front in accuracy–speed trade-offs. For metallic systems (Al-Cu-Zr), MACE achieves the lowest MAE at higher runtime; for Si-O (ionic-covalent bonding), NequIP outperforms others.
  • GPU acceleration provides up to 100× speedup, making sophisticated MLIPs viable for long-timescale and large-scale MD even compared to unaccelerated classical force fields.
  • User friendliness varies: ACE via the “pacemaker” code offers automated parameter tuning; others may require more complex setup.
  • Smooth potential energy surfaces and controlled extrapolation (tested via NVE conservation and phase stability) are as important as predictive accuracy, since noise or unphysical minima can induce simulation instability; a sketch of an NVE drift check follows this list.
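
A sketch of the NVE conservation check mentioned above, using ASE's velocity-Verlet integrator. The EMT calculator is a stand-in so the example runs; in practice one would attach the trained MLIP's ASE calculator:

```python
from ase.build import bulk
from ase.calculators.emt import EMT            # stand-in; attach the MLIP calculator here
from ase.md.velocitydistribution import MaxwellBoltzmannDistribution
from ase.md.verlet import VelocityVerlet
from ase import units

# Run microcanonical (NVE) MD and monitor total-energy drift per atom.
atoms = bulk("Cu", cubic=True).repeat(3)
atoms.calc = EMT()
MaxwellBoltzmannDistribution(atoms, temperature_K=300)

dyn = VelocityVerlet(atoms, timestep=1.0 * units.fs)
e0 = atoms.get_total_energy()
dyn.run(1000)                                   # 1 ps of dynamics
drift = abs(atoms.get_total_energy() - e0) / len(atoms)
print(f"Energy drift: {drift * 1e3:.3f} meV/atom over 1 ps")
```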

5. Applications: High-Throughput Screening, Specialized Materials, and Automated Pipelining

MLIPs are now central to high-throughput computational materials discovery and optimization:

  • High-Entropy Alloys (HEAs): MTP-based MLIPs screen and optimize multi-component HEAs (e.g., MoNbTaW, MoNbTaTiW), predict mechanical properties over hundreds of configurations, guide compositional tuning for targeted hardness–ductility, and align well with experimental Vickers hardness and bulk modulus values (Pandey et al., 2022).
  • Disordered Rocksalts and Battery Materials: Large-scale disordered rocksalts (LiTMO$_2$ with 11 elemental components) are screened using AENET (for energies) and MTP (for forces). Accurate voltages, elastic constants, and force fields enable high-throughput electrode discovery (Choyal et al., 2023).
  • Van der Waals Materials: Dispersion corrections (e.g., D3) incorporated into MLIPs are mandatory for accurate prediction of interlayer interactions in 2D heterobilayers. Message-passing neural network MLIPs, after such corrections, yield errors in interlayer distance and band energies on par with the intrinsic uncertainty of DFT (~0.1 Å for distance, 35 meV for bands) (Sauer et al., 8 Apr 2025).
  • Molecular Crystals: Foundational models (e.g., MACE-MP-0) fine-tuned on ~200 structures achieve sub-chemical accuracy in sublimation enthalpies for systems such as X23, paracetamol, and squaric acid, capturing anharmonicity and nuclear quantum effects (Pia et al., 21 Feb 2025).
  • Automated Pipelines: MLIP construction is being fully automated (e.g., AMLP), integrating LLM agents for DFT input selection, literature mining, dataset creation, model training (with MACE fine-tuning), and robust post-training validation. MAEs of ~1.7 meV/atom (energy) and ~7 meV/Å (forces), sub-Å reproduction of DFT geometries, and MD stability are attainable in such pipelines (Lahouari et al., 25 Sep 2025).

6. Extensions for Complex Physics: Long-Range Interactions and Data Compression

Expanding the range of physical phenomena accessible to MLIPs is an ongoing priority. Local environment representations are insufficient for long-range interactions (e.g., electrostatics, charge transfer). Two main strategies have emerged:

  • Explicit Physical Modeling: Integrating physics-based corrections (Coulomb, dispersion) or using explicit global charge equilibration within an equivariant GNN enables treatment of charge transfer, interfaces, and compositional heterogeneity (Maruf et al., 23 Mar 2025, Sauer et al., 8 Apr 2025). In NequIP-LR, the global charge distribution is predicted via a charge equilibration scheme grounded in network-predicted atomic electronegativities and solved by minimizing an energy subject to the charge-conservation constraint; a sketch of this linear solve follows this list.
  • Compression Techniques: The parameter count in highly expressive MLIPs is a computational bottleneck. Low-rank matrix and tensor factorizations of radial parameters can achieve ~50% compression in MTPs, reducing memory and computational cost with no accuracy penalty (Vorotnikov et al., 4 Sep 2025). Rank-adaptive algorithms further help avoid local minima during training.
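
As referenced in the first bullet, charge equilibration reduces to a constrained quadratic minimization, $\min_q \, \chi^\top q + \tfrac{1}{2} q^\top J q$ subject to $\sum_i q_i = Q_\mathrm{tot}$, solvable as a single KKT linear system. The sketch below is an illustration of that generic scheme, not the NequIP-LR implementation itself:

```python
import numpy as np

def charge_equilibration(chi, J, q_total=0.0):
    """Minimize E(q) = chi.q + 0.5 q^T J q subject to sum(q) = q_total.

    chi: per-atom electronegativities (in NequIP-LR, network-predicted);
    J: hardness/Coulomb interaction matrix. Solved via the KKT system
    [J 1; 1^T 0] [q; lam] = [-chi; q_total]. Illustrative sketch only.
    """
    n = len(chi)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = J
    A[:n, n] = 1.0                               # Lagrange multiplier column
    A[n, :n] = 1.0                               # charge-conservation row
    b = np.concatenate([-np.asarray(chi), [q_total]])
    sol = np.linalg.solve(A, b)
    return sol[:n]                               # equilibrated atomic charges

# Usage: q = charge_equilibration(chi, J)
#        E_elec = chi @ q + 0.5 * q @ J @ q
```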

7. Challenges, Limitations, and Future Directions

Despite their rapid adoption, MLIPs face intrinsic challenges:

  • Long-range Interaction Capture remains imperfect for electronic, dispersion, and charge effects in systems with substantial nonlocality. Modular augmentation and “hybrid” models, as well as new universal descriptors, are active research areas (Maruf et al., 23 Mar 2025).
  • Generalization is governed by the informativeness and diversity of the training data, fitting observables, and domain-specific loss functions. Recent theoretical work provides explicit error bounds connecting domain size and observable set to the generalization error, motivating best practices for training set construction (Ortner et al., 2022).
  • User Friendliness and Interoperability vary widely. Integrated libraries such as the MLIP library (Brunken et al., 28 May 2025) are addressing this by allowing flexible experimentation, fine-tuning, and deployment via modular APIs and high-performance backends (JAX-MD, ASE).
  • Universal Potentials and Foundational Models aspire to span elementally and structurally diverse chemistry, analogous to LLMs in NLP. Such models are pre-trained on massive, multi-fidelity datasets and fine-tuned for application-specific tasks (Jacobs et al., 12 Mar 2025).
  • Integration with Automated and AI-Driven Pipelines is accelerating, with LLM agents handling code selection and literature analysis—lowering the barrier for high-accuracy MLIP construction and validation (Lahouari et al., 25 Sep 2025).

Future development will focus on robust active learning, hybrid quantum–machine learning loops, tailored inclusion of nonlocal observables, efficient model compression, and full automation incorporating both physical and data-driven priors. The trajectory of the field suggests that MLIPs will soon become standard tools for accurate, scalable, and application-tailored atomistic simulation in condensed matter, chemistry, and biophysics.
