ML Universal Interatomic Potentials

Updated 10 November 2025

An MLUIP is a universal surrogate model trained on extensive quantum-mechanical datasets; it approximates potential-energy surfaces using physics-informed GNN architectures.
It employs advanced training protocols with active learning, optimized hyperparameters, and uncertainty quantification to ensure robust cross-domain performance.
MLUIPs are applied in high-throughput materials screening, phase-field modeling, and property predictions, bridging DFT-level accuracy with computational efficiency.

Machine Learning Universal Interatomic Potentials (MLUIPs) are parameterized surrogate models, typically based on graph neural networks (GNNs) or advanced message-passing architectures, that are trained on large, diverse databases of quantum-mechanical reference data. These models can approximate the potential-energy surface (PES), atomic forces, and associated properties for a chemically and structurally broad range of atomic configurations—across multiple elements, phases, and defect topologies—without system-specific retraining. The advance from material- or element-specific potentials to universal models reflects a convergence of increased dataset diversity, expressive many-body graphical representations, and regularized training protocols enabling robust transferability and out-of-domain performance.

1. Mathematical Formulation and Physical Constraints

The central objective of an MLUIP is to construct the mapping

$E(\{\mathbf{r}_i, Z_i\}) = \sum_i E_i(\mathcal{G}_i)$

where $\mathbf{r}_i$ are atomic positions, $Z_i$ are atomic species, and $\mathcal{G}_i$ encodes the local atomic environment of atom $i$, including element types, neighbor distances, and angular information within a cutoff. Forces follow by analytic differentiation:

$\mathbf{F}_i = -\nabla_{\mathbf{r}_i} E$

Advanced formulations use equivariant GNN architectures (e.g., MACE, NequIP, EquiformerV2) to construct atomic feature tensors that transform correctly under SO(3) rotations, and embed geometric correlations through tensor products with spherical harmonics, Clebsch–Gordan contractions, and multi-body basis expansions (Shuang et al., 5 Feb 2025, Yu et al., 8 Mar 2024).
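As a minimal sketch of the atomic-energy decomposition and the force expression, the following pure-Python example replaces the learned $E_i(\mathcal{G}_i)$ with a toy smooth pair term. The function names, the pair form, and the 5 Å cutoff are illustrative assumptions, not part of any MLUIP; the point is only that forces obtained by analytic differentiation agree with a finite-difference derivative of the summed energy.

```python
import math

def pair_energy(r, r_cut=5.0):
    """Toy pair term (hypothetical stand-in for a learned model); zero beyond the cutoff."""
    if r >= r_cut:
        return 0.0
    return math.exp(-r) * (1.0 - r / r_cut) ** 2

def pair_energy_deriv(r, r_cut=5.0):
    """Analytic derivative of pair_energy with respect to r."""
    if r >= r_cut:
        return 0.0
    s = 1.0 - r / r_cut
    return -math.exp(-r) * s * s - 2.0 * math.exp(-r) * s / r_cut

def atomic_energies(positions, r_cut=5.0):
    """E_i: each atom receives half of every pair term in its local environment."""
    n = len(positions)
    energies = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i != j:
                r = math.dist(positions[i], positions[j])
                energies[i] += 0.5 * pair_energy(r, r_cut)
    return energies

def total_energy(positions, r_cut=5.0):
    """E = sum_i E_i."""
    return sum(atomic_energies(positions, r_cut))

def forces(positions, r_cut=5.0):
    """F_i = -dE/dr_i, evaluated analytically."""
    n = len(positions)
    f = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r = math.dist(positions[i], positions[j])
            if r >= r_cut:
                continue
            dphi = pair_energy_deriv(r, r_cut)
            for k in range(3):
                # dE/dr_i gets phi'(r) * (r_i - r_j) / r from each neighbor j
                f[i][k] -= dphi * (positions[i][k] - positions[j][k]) / r
    return f
```

In practice the derivative is obtained by automatic differentiation through the network rather than by hand, but the relationship between energy and forces is the same.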

Physics-driven constraints, such as the Universal Equation of State (UEOS) for cohesive energy scaling,

$E^* = (r^* + 1)\exp(-r^*),\quad r^* = a(r - r_0)$

have been embedded to constrain model extrapolation, minimize required parameters, and ensure correct asymptotics under extreme compression/dilation (Hu et al., 11 Feb 2025). In certain contexts (e.g., phonons), restriction to a quadratic (harmonic) PES,

$E_{\mathrm{har}} = \frac{1}{2}\sum_{i, j}\sum_{\alpha, \beta} \Phi_{ij}^{\alpha\beta} u_i^\alpha u_j^\beta$

enables reparametrization in terms of interatomic force constants and efficient, physics-based uncertainty quantification (Lee et al., 17 Feb 2024).
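The UEOS scaling form given earlier in this section can be evaluated directly. The sketch below implements the expression exactly as written and checks its basic shape; it assumes that $E^*$ is the energy divided by its equilibrium value (the normalization is not stated here), and the numerical values of `r0` and `a` are arbitrary illustrative choices.

```python
import math

def ueos_scaled_energy(r, r0, a):
    """E* = (r* + 1) exp(-r*) with r* = a (r - r0), as written in the text."""
    r_star = a * (r - r0)
    return (r_star + 1.0) * math.exp(-r_star)

def ueos_scaled_slope(r, r0, a):
    """dE*/dr = -a r* exp(-r*), obtained by differentiating the expression above."""
    r_star = a * (r - r0)
    return -a * r_star * math.exp(-r_star)

# Illustrative (assumed) parameters: equilibrium spacing 2.5 and scaling 1.8.
r0, a = 2.5, 1.8
at_equilibrium = ueos_scaled_energy(r0, r0, a)      # scaled energy equals 1 at r = r0
slope_at_equilibrium = ueos_scaled_slope(r0, r0, a)  # stationary point at r = r0
far_apart = ueos_scaled_energy(20.0, r0, a)          # decays toward 0 under dilation
```

The stationary point at $r = r_0$ and the exponential decay at large separation are the asymptotic properties that such a constraint builds into the model independently of the training data.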

2. Model Architectures, Training Protocols, and Hyperparameters

MLUIP construction relies on chemically and physically motivated architecture choices:

Model Family	Symmetry Type	Basis/Features	Typical Applications
MACE	E(3)-equivariant	Body-order polynomials	Energies, forces, phonons, defects
EquiformerV2	Equivariant Transformer	Spherical harmonics	Defects, high-entropy alloys
SevenNet	Extended NequIP	Spherical harmonics (l≤3), Bessel	Surfaces, molecules, cross-domain
DimeNet++	Directional GNN	Bond angles, Bessel	Harmonic phonons (harmonic PES)
SUS2-MLIP	Physically-constrained	UEOS, nonlinear basis	General materials, ultra-fast evaluation
ALIGNN	Line graph GNN	Bonds, angles	Extended bulk, defects

Training utilizes datasets exceeding $10^8$ DFT configurations (Shuang et al., 5 Feb 2025, Loew et al., 25 Aug 2025), encompassing relaxed structures, high-temperature/pressure MD, defected lattices, surfaces, and amorphous phases, sampled across a large element set (often >80). Loss functions combine energy, force, and optionally stress regression,

$\mathcal{L} = w_E \sum |E^{\text{pred}} - E^{\text{DFT}}|^2 + w_F \sum_i \|\mathbf{F}_i^{\text{pred}} - \mathbf{F}_i^{\text{DFT}}\|^2 + w_\sigma \sum |\sigma^{\text{pred}} - \sigma^{\text{DFT}}|^2$

with explicit balance (e.g., 1:10:100 for energy:forces:stress) (Liu et al., 9 Jun 2025, Liu et al., 27 Jun 2025).
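A minimal sketch of this weighted loss, assuming plain sums of squared errors with no per-atom or per-batch normalization (real training codes usually normalize, and the function and argument names here are hypothetical):

```python
def weighted_loss(e_pred, e_ref, f_pred, f_ref, s_pred, s_ref,
                  w_e=1.0, w_f=10.0, w_s=100.0):
    """Weighted sum of squared energy, force, and stress errors.

    e_*: per-structure energies; f_*: per-atom 3-vectors;
    s_*: flattened stress components. Default weights follow the
    1:10:100 energy:forces:stress example quoted in the text.
    """
    energy_term = sum((p - r) ** 2 for p, r in zip(e_pred, e_ref))
    force_term = sum(
        sum((pc - rc) ** 2 for pc, rc in zip(p, r))  # squared norm of the force error
        for p, r in zip(f_pred, f_ref)
    )
    stress_term = sum((p - r) ** 2 for p, r in zip(s_pred, s_ref))
    return w_e * energy_term + w_f * force_term + w_s * stress_term

loss = weighted_loss(
    e_pred=[1.0], e_ref=[0.9],
    f_pred=[[0.1, 0.0, 0.0]], f_ref=[[0.0, 0.0, 0.0]],
    s_pred=[0.01], s_ref=[0.0],
)
```

With these inputs the three weighted terms contribute 0.01, 0.1, and 0.01 respectively, which shows how the weights rescale error components that differ in magnitude and units.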

Hyperparameters are systematically optimized: cutoff distances of 5–6 Å, atom/channel/edge embedding dimensions of 64–128, message-passing depths of 2–8, batch sizes up to 256 (for parallelism), and learning rates decaying from ~1e-3 to 1e-5.
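For orientation, these ranges can be collected in a small configuration and a decay schedule. The key names are hypothetical, the specific values are single picks from the quoted ranges, and the exponential form of the decay is an assumption, since only the start and end learning rates are given.

```python
# Hypothetical configuration; values are picked from the ranges quoted above.
config = {
    "cutoff_angstrom": 6.0,
    "embedding_dim": 128,
    "num_message_passing_layers": 4,
    "batch_size": 256,
    "lr_start": 1e-3,
    "lr_end": 1e-5,
}

def exponential_lr(step, total_steps, lr_start, lr_end):
    """Geometric interpolation from lr_start (step 0) to lr_end (final step)."""
    frac = min(max(step / total_steps, 0.0), 1.0)  # clamp progress to [0, 1]
    return lr_start * (lr_end / lr_start) ** frac

lr_mid = exponential_lr(500, 1000, config["lr_start"], config["lr_end"])
```

Halfway through training this schedule sits at the geometric mean of the two endpoints, roughly 1e-4.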

Active-learning strategies (uncertainty-driven MD, data filtering, diversity maximization) prioritize maximally informative training samples, ensuring coverage of extrapolative and rare-event regions (Zaverkin et al., 2023). Loss-weighting, learning-rate scheduling, and regularization stabilize parameter convergence.
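Diversity maximization can be illustrated with greedy farthest-point sampling over per-configuration descriptors. This is one common selection heuristic, not necessarily the one used in the cited work; the descriptors, the Euclidean metric, and the function name are assumptions.

```python
import math

def farthest_point_sampling(descriptors, k, start=0):
    """Greedily pick k indices so each new pick is farthest from those already chosen."""
    selected = [start]
    # Distance from every candidate to its nearest selected point so far.
    min_dist = [math.dist(d, descriptors[start]) for d in descriptors]
    while len(selected) < min(k, len(descriptors)):
        nxt = max(range(len(descriptors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, d in enumerate(descriptors):
            min_dist[i] = min(min_dist[i], math.dist(d, descriptors[nxt]))
    return selected

# Two tight clusters plus one isolated point in a toy 2D descriptor space.
candidates = [[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.0, 0.1], [0.0, 5.0]]
picked = farthest_point_sampling(candidates, k=3)
```

The selection takes one configuration from each distinct region instead of spending the labeling budget on near-duplicates.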

3. Performance and Benchmarking: Accuracy, Robustness, and Transferability

MLUIPs now reach energy mean absolute errors (MAEs) below roughly 5–10 meV/atom and force errors below 100 meV/Å across broad datasets (Shuang et al., 5 Feb 2025, Yu et al., 8 Mar 2024, Loew et al., 21 Dec 2024):

Formation energies: MACE, CHGNet, eqV2 achieve MAE ≈0.04–0.06 eV/atom for >1 million inorganic structures.
Phonons and band curvatures: MACE, SevenNet, and MatterSim reach phonon band MAE <20 meV, comparable to the spread between PBE and PBEsol DFT results (Loew et al., 21 Dec 2024, Lee et al., 17 Feb 2024).
Surface and defect energetics: EquiformerV2, MACE, CHGNet attain defect energy and grain boundary RMSE ≲5 meV/atom and force RMSE ≲100 meV/Å (GB-56, HEA10, alloys) (Shuang et al., 5 Feb 2025).
Thermodynamic and elastic properties: Bulk/shear/Young’s modulus predicted to MAE ~10–18 GPa (MACE, MatterSim, SevenNet); heat capacities and vibrational entropy within 2–5% (Gao et al., 27 Oct 2025, Loew et al., 21 Dec 2024).

Crucially, MLUIPs generalize across the full periodic table, arbitrary crystal phases (space groups 1–230), and an array of chemically distinct environments (oxides, chalcogenides, high-entropy alloys, 2D materials), provided such features appear in the training set.

Assessment protocols include energy/force regression against DFT, property-specific metrics (elastic constants, phonon frequencies, defect formation energies), computational throughput (e.g., O(10⁸) atom·steps/s (Hu et al., 11 Feb 2025)), and out-of-distribution (OOD) error analysis based on internal descriptor distances (Focassio et al., 7 Mar 2024).
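Two of these checks are simple enough to sketch: the MAE against reference values, and a descriptor-distance OOD score. The nearest-neighbor Euclidean distance and the threshold below are assumptions for illustration; the descriptor and metric used in the cited analysis are not specified here.

```python
import math

def mae(pred, ref):
    """Mean absolute error between predictions and reference (e.g. DFT) values."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)

def ood_score(query, train_descriptors):
    """Distance from a query descriptor to its nearest training descriptor."""
    return min(math.dist(query, t) for t in train_descriptors)

def flag_ood(queries, train_descriptors, threshold):
    """Mark queries whose nearest training neighbor is farther than the threshold."""
    return [ood_score(q, train_descriptors) > threshold for q in queries]

train = [[0.0, 0.0], [1.0, 0.0]]
flags = flag_ood([[0.5, 0.0], [4.0, 3.0]], train, threshold=1.0)
energy_mae = mae([1.0, 2.0], [1.5, 2.0])
```

A configuration flagged this way is one where the regression metrics measured on in-distribution test data should not be assumed to hold.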

4. Uncertainty Quantification, Limitations, and Physical Consistency

Uncertainty quantification (UQ) is essential for reliable deployment. Methods include:

Ensemble-based error propagation: Standard deviation across model ensembles tracks energy error magnitude (Shuang et al., 5 Feb 2025).
Gradient-based or Bayesian linear regression uncertainties: Used for targeted data sampling and UQ-aware MD/active learning (Zaverkin et al., 2023).
Physics-inspired UQ in constrained models: Within a machine learning universal harmonic interatomic potential (MLUHIP), deviation from F = -Φu quantifies error in phononic properties; relative residuals |δF|/|F| directly correlate with errors in the phonon frequencies ω and the vibrational free energy F_vib (Lee et al., 17 Feb 2024).
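The ensemble and harmonic-residual estimates in this list reduce to short calculations. In the sketch below, the function names are hypothetical, the force-constant matrix is a toy 2×2 example, and the choice of the reference force as the denominator of the relative residual is an assumption.

```python
import math
from statistics import pstdev

def ensemble_uncertainty(predictions):
    """Standard deviation of one quantity across ensemble members."""
    return pstdev(predictions)

def harmonic_forces(phi, u):
    """F = -Phi u for a flattened displacement vector u and force-constant matrix phi."""
    return [-sum(row[j] * u[j] for j in range(len(u))) for row in phi]

def relative_force_residual(f_ref, phi, u):
    """|dF| / |F|: deviation of reference forces from the harmonic prediction."""
    f_har = harmonic_forces(phi, u)
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(f_ref, f_har)))
    den = math.sqrt(sum(a * a for a in f_ref))
    return num / den

phi = [[2.0, -1.0], [-1.0, 2.0]]   # toy force constants (assumed values)
u = [0.1, 0.0]                     # small displacement of the first coordinate
exact = relative_force_residual([-0.2, 0.1], phi, u)    # forces are exactly harmonic
perturbed = relative_force_residual([-0.22, 0.1], phi, u)
spread = ensemble_uncertainty([1.0, 3.0])
```

A residual near zero indicates that the harmonic model reproduces the reference forces, while a growing residual signals that derived phonon properties should be treated with caution.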

Systematic limitations include:

Extrapolation outside sampled chemical/structural regimes: Performance degrades under high pressures (>100 GPa) or on low-coordination surface sites unless such configurations are reflected in training (Loew et al., 25 Aug 2025, Focassio et al., 7 Mar 2024).
Absence of explicit long-range physics (electrostatics, van der Waals): Most MLUIPs assume finite cutoffs, though hybrid corrections or charge prediction modules are emerging areas.
Anharmonicity: Harmonic models cannot capture temperature-dependent phonon frequency shifts or linewidth broadening, nor thermal conductivity beyond the relaxation-time approximation (RTA).

Mitigation strategies encompass dataset enrichment, continual learning (replay buffers, elastic weight consolidation), and the development of physically motivated architectures (e.g., with embedded scaling laws or multi-modality) (Hu et al., 11 Feb 2025, Batatia et al., 29 Oct 2025).
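Elastic weight consolidation, one of the continual-learning options named above, adds a quadratic penalty that anchors parameters important to earlier data. The sketch below uses the standard diagonal-Fisher form; the parameter values, the Fisher estimates, and the function names are illustrative assumptions.

```python
def ewc_penalty(params, anchor_params, fisher_diag, strength):
    """(strength / 2) * sum_i F_i * (theta_i - theta_i_anchor)^2."""
    return 0.5 * strength * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )

def continual_loss(task_loss, params, anchor_params, fisher_diag, strength):
    """New-task loss plus the penalty for drifting from the earlier solution."""
    return task_loss + ewc_penalty(params, anchor_params, fisher_diag, strength)

# The first parameter is important to the old task (large Fisher value) but unchanged;
# the second has moved by 1.0 but carries little importance.
penalty = ewc_penalty(
    params=[1.0, 2.0], anchor_params=[1.0, 1.0], fisher_diag=[10.0, 0.5], strength=2.0
)
```

Parameters with large Fisher values are held close to their earlier values, which limits forgetting while leaving less important parameters free to adapt to new data.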

5. Fine-Tuning, Active Learning, and Cross-Domain Adaptation

While universal MLUIPs attain high baseline accuracy, fine-tuning on small, task-specific datasets further reduces errors and adapts the foundation model to target applications (Liu et al., 9 Jun 2025, Liu et al., 27 Jun 2025). Best practices include:

Starting from a medium-to-large universal backbone (e.g., MACE-MP-0b), adding task-specific prediction heads, and retaining full backbone trainability.
Employing aggressive dataset curation (manual filtering, uncertainty-based selection) to isolate informative configurations, reducing RMSE by 2–3×.
Applying loss scaling (e.g., w_E:w_F:w_σ = 1:10:100) and gradient clipping to stabilize the fine-tuning process.
Multi-task and multi-domain pipelines (SevenNet-Omni, MACE-mh-omat) leverage selective regularization and domain-bridging samples (typically ~0.1% cross-domain recomputed configurations) to align potential-energy landscapes across functional and chemical domains, yielding state-of-the-art cross-domain adsorption-energy accuracy (≲0.06 eV) (Kim et al., 13 Oct 2025, Batatia et al., 29 Oct 2025).
Ongoing work focuses on active-learning loops and UQ-guided sample acquisition during MD or phase-field modeling for efficient dataset enrichment.
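Gradient clipping, mentioned among the practices above, is commonly done by rescaling the whole gradient when its global norm exceeds a limit. The following is a minimal sketch over a flat list of gradient values; the function name and the limit of 1.0 are assumptions, and deep-learning frameworks provide their own equivalents.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale grads so their L2 norm does not exceed max_norm.

    Returns the (possibly rescaled) gradients and the original norm.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads), norm
    scale = max_norm / norm
    return [g * scale for g in grads], norm

clipped, original_norm = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
```

The direction of the update is preserved and only its length is capped, which prevents a few high-error configurations from destabilizing a pretrained backbone early in fine-tuning.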

6. Applications, Impact, and Future Challenges

MLUIPs have been rapidly deployed as out-of-the-box tools for:

High-throughput screening: Identification of defect-tolerant compounds, novel two-dimensional materials via vacancy energetics, and estimation of stability via convex hull analysis (Berger et al., 9 Apr 2025).
Phase-field modeling: Embedding MLUIPs within variational Gaussian frameworks enables atom-resolved thermodynamic mapping, including around grain boundaries and defects at experimental accuracy (Masuda et al., 16 Sep 2025).
Property prediction: Thermodynamics, phonons, heat capacities, and elastic moduli are all accessible with DFT-comparable fidelity and several orders of magnitude computational speedup.
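The convex hull stability estimate used in screening can be sketched for a binary system. Each phase is a pair (composition fraction x, formation energy per atom), and the energy above the hull is the distance from a phase to the lower convex envelope. The phase data are invented for illustration, the function names are hypothetical, and the end members at x = 0 and x = 1 are assumed to be included with zero formation energy.

```python
def lower_hull(points):
    """Lower convex hull of (x, energy) tuples, via a monotone-chain scan."""
    pts = sorted(set(points))
    hull = []
    for p in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross > 0:      # hull[-1] lies strictly below the chord, keep it
                break
            hull.pop()         # hull[-1] is on or above the chord, discard it
        hull.append(p)
    return hull

def hull_energy(x, hull):
    """Linear interpolation of the hull energy at composition x."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x2 > x1 and x1 <= x <= x2:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    raise ValueError("composition outside the range covered by the hull")

def energy_above_hull(x, e_form, hull):
    """Zero for stable phases, positive for metastable or unstable ones."""
    return e_form - hull_energy(x, hull)

# Invented formation energies (eV/atom) for a hypothetical A-B system.
phases = [(0.0, 0.0), (1.0, 0.0), (0.5, -0.4), (0.25, -0.1), (0.75, -0.3)]
hull = lower_hull(phases)
e_hull_quarter = energy_above_hull(0.25, -0.1, hull)
```

In this toy data set the phase at x = 0.25 lies above the tie line between the end member and the x = 0.5 phase, so it would be predicted to decompose, whereas the phases at x = 0.5 and x = 0.75 sit on the hull.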

The challenge remains to achieve true universality—reliable, physical, and accurate predictions for all materials, environments, and property classes under arbitrary external fields. Current research directions emphasize enrichment of datasets with extreme-pressure, temperature, and low-dimensional configurations (Loew et al., 25 Aug 2025), robust cross-functional transfer (r²SCAN, hybrid DFT, beyond-PBE), and physically motivated model innovations (e.g., enforcing explicit charge and electrostatics, global equation-of-state scaling, or many-body, environment-adaptive kernels).

The field is converging toward an ecosystem of routinely validated, physically consistent, extensible MLUIP models, supported by open benchmarks and modular frameworks, providing the infrastructure for predictive materials, interface, and molecular simulation at scale.