NequIP: Neural Equivariant Interatomic Potentials

Updated 13 January 2026

Neural Equivariant Interatomic Potentials (NequIP) are E(3)-equivariant graph neural networks that learn atomic interactions by encoding rotation, translation, and reflection symmetries.
The architecture employs message-passing with tensor-product filtering using Clebsch–Gordan coefficients to integrate geometric information and improve prediction accuracy.
It demonstrates high data efficiency, scalability, and practical applicability in molecular dynamics, materials discovery, and complex phase simulations.

Neural Equivariant Interatomic Potentials (NequIP) are E(3)-equivariant, message-passing graph neural networks developed for learning interatomic potentials directly from ab initio data. By encoding rotational, translational, and reflection symmetry at a tensorial level, NequIP enables highly data-efficient, accurate modeling of atomic-scale interactions for molecular dynamics, materials discovery, and complex phase simulations.

1. Theoretical Foundations and Symmetry Principles

NequIP is formulated to respect the fundamental symmetries of physical systems under the Euclidean group E(3), encompassing translations, rotations, and inversion. The model enforces equivariance not just for scalar quantities but for all geometric tensors encountered in atomistic physics. Each atomic site in the graph carries a collection of features organized into irreducible representations (irreps) of SO(3), indexed by rotational order $\ell$ and parity $p$ , $h_i^{(\ell, p)} \in \mathbb{R}^{C_\ell \times (2\ell + 1)}$ . These features transform under rotation as $D^{(\ell)}(R) h_i^{(\ell, p)}$ , where $D^{(\ell)}$ is the Wigner D-matrix, ensuring proper transformation laws for all intermediate quantities and outputs (Batzner et al., 2021, Batatia et al., 2022, Lee et al., 2023).

This symmetry treatment enables strict rotation-, translation-, and reflection-invariance of predicted energies and equivariance of vector/tensor observables such as forces and virials, in contrast with prior scalar-invariant machine-learned interatomic potentials (MLIPs).

2. Core Message-Passing and Tensor-Product Design

At the heart of NequIP lies an E(3)-equivariant message-passing architecture. For each central atom $i$ and its neighbor $j$ within a cutoff $r_c$ , features are updated via:

Message construction:

$m_i^{(\ell)} = \sum_{j \in \mathcal{N}(i)} M^{(\ell)}\bigl(h_i^{(\ell)}, h_j^{(\ell)}, e_{ij}^{(\ell)}\bigr)$

Message filters:

The edge embedding $e_{ij}^{(\ell)}$ is expanded as:

$e_{ij}^{(\ell)}(r_{ij}) = \sum_{k, q} w_{kq}^{(\ell)} R_k(\|r_{ij}\|) Y_q(\hat{r}_{ij})$

using a learned radial basis $R_k$ (e.g., Bessel functions with linear combination), real spherical harmonics $Y_q$ , and weights $w_{kq}^{(\ell)}$ .

Tensor-product coupling:

The Clebsch–Gordan tensor product contracts the filter and neighbor features:

$m_{ij}^{(\ell)} = \sum_{k, q} w_{kq}^{(\ell)} R_k(\|r_{ij}\|) Y_q(\hat r_{ij}) \otimes h_j^{(\ell)}$

mapping pairs of irreps $(\ell_1, \ell_2)$ to all allowed output orders via standard CG coefficients.

Channel mixing and update:

Linear layers $U^{(\ell)}$ mix the respective channels within each irrep, followed by a "gated" equivariant nonlinearity (e.g., SiLU) and a residual update:

$h_i^{(\ell+1)} = U^{(\ell)}\bigl(h_i^{(\ell)}, m_i^{(\ell)}\bigr)$

This recurrence is stacked for $T$ interaction blocks (Batzner et al., 2021, Park et al., 2024, Tan et al., 22 Apr 2025, Batatia et al., 2022).

The design ensures all intermediate and final operations preserve E(3)-equivariance. Notably, the use of Clebsch–Gordan tensor products is both mathematically necessary and empirically superior, as confirmed by ablations (Batatia et al., 2022).

3. Training Procedures, Losses, and Data Efficiency

The total energy is decomposed into a sum over learned atomic energies: $E(\mathbf{R}, \mathbf{Z}) = \sum_i E_i$ where $E_i$ depends only on the local environment and features of atom $i$ . Forces and virials are obtained via automatic differentiation: $\mathbf{F}_i = -\nabla_{\mathbf{r}_i} E\,, \qquad \sigma = \frac{1}{V} \sum_i \mathbf{r}_i \otimes \mathbf{F}_i$ Training loss typically combines energies, forces, and sometimes stresses, usually weighted to prioritize force accuracy: $\mathcal{L} = w_E \|E^{\rm pred} - E^{\rm ref}\|^2 + w_F \sum_i \|\mathbf{F}_i^{\rm pred} - \mathbf{F}_i^{\rm ref}\|^2 + w_\sigma \|\sigma^{\rm pred} - \sigma^{\rm ref}\|^2$ Optimizers such as Adam or AMSGrad and learning rates of $10^{-3}$ -- $10^{-2}$ are typical. Early stopping, learning rate decay, and gradient norm clipping are employed for stability (Vita et al., 2023, Seth et al., 2024, Leimeroth et al., 5 May 2025).

NequIP exhibits high data efficiency: errors on MD-17 and periodic benchmarks decay as $\varepsilon_F \propto N_{\rm train}^{-\beta}$ , with $\beta \sim 0.4$ –0.6. Equivariant variants ( $\ell_{\max}>0$ ) consistently outperform invariant ( $\ell_{\max}=0$ ) models both in accuracy and in learning-curve slope (Batzner et al., 2021, Vita et al., 2023).

4. Hyperparameter Choices and Ablation Insights

Extensive ablation studies have quantitatively established optimal design principles:

Sufficient channel dimension $K \gtrsim 64$ is necessary for high accuracy.
Gated equivariant nonlinearity (e.g., SiLU) applied to the norm of the message is essential.
Residual self-connection at each layer is critical for stability and performance.
Message normalization by neighborhood size is important, especially at high temperature or for OOD generalization.
Data normalization via physically motivated one-body reference energies assists in reactive/long-range extrapolation (Batatia et al., 2022).

Optimal architectural parameters for robust, accurate models are: 4–5 interaction blocks, $l_{\max}=2$ , $r_c = 4$ –7.6 Å; 32–128 channels per $\ell$ ; 8–16 radial basis functions (Leimeroth et al., 5 May 2025, Park et al., 2024).

5. Computational Performance and Scalability

NequIP models, originally single-GPU and single-node, now achieve strong scaling and parallel efficiency suitable for massive datasets and production molecular dynamics:

SevenNet, a scalable NequIP derivative, attains $\gtrsim 80\%$ parallel efficiency in weak scaling to 32 GPUs, though strong scaling efficiency decreases when per-GPU atom counts drop below $5$–$10$k due to SM underutilization (Park et al., 2024).
PyTorch 2.0 integration and AOTI (Ahead-of-Time Inductor) compilation yield speedups up to $18\times$ for inference and $2.4$– $5.0\times$ for training compared to earlier TorchScript pipelines. Custom Triton kernels further accelerate the tensor-product bottleneck (Tan et al., 22 Apr 2025).
On modern A100/MI250X GPUs, large NequIP models run at $1$– $10~\mu$ s per atom per MD step, facilitating nanosecond-scale trajectories with system sizes $\sim 10^4$ – $10^5$ atoms (Leimeroth et al., 5 May 2025).
Efficient spatial decomposition strategies (e.g., halo-exchange with pipelined MPI communication) enable practical integration with LAMMPS and large-scale MD frameworks (Park et al., 2024).

A summary of strong/weak scaling for NequIP/SevenNet (SiO $_2$ ) is as follows:

Channels	Layers	$E_{\rm weak}(32)$	$E_{\rm strong}(32)$	Speed-up (32 GPUs)
4	2	0.67	13%	4.0
32	4	0.79	48%	15.4
64	5	0.84	48%	15.4

6. Benchmark Accuracy, Extrapolation, and Application Domains

NequIP models are Pareto-optimal in the accuracy-cost landscape for structurally and chemically complex systems. In systematic benchmarks:

Achieves MAEs as low as $2$–$5$ meV/atom and $1$–$6$ meV/Å for small molecules (MD-17; $l_{\max}=3$ ).
For Si–O, force MAE $0.052$ eV/Å at $450~\mu$ s/atom·step (CPU), competitive with or superior to MACE/ACE; for Al–Cu–Zr, force MAE $0.060$ eV/Å at $780~\mu$ s/atom·step (Leimeroth et al., 5 May 2025).
Equivariant NequIP generalizes well to OOD configurations—e.g., Laves phases or high-pressure phases not in the training set, with energy errors increasing sub-linearly with extrapolation distance.
NVE MD runs show strong energy conservation (drift $\lesssim 10^{-5}$  eV/atom up to $1800$ K for metallic systems), and self-termination (“crash”) at domain boundaries (high $P$ silica at $>120$  GPa).

Practical applications include:

Modeling Li $^+$ transport in amorphous LiPON, with MAE $6.1$ meV/atom and force MAE $13.2$ meV/Å, enabling diffusivity and conductivity calculations directly from MLIP-driven MD (Seth et al., 2024).
Thermal conductivity computation for phase-change materials (e.g., GeTe) via Green–Kubo MD, reproducing both crystalline/amorphous phase transitions and heat transport within $10$–$20$\% of DFT/experiment (Lee et al., 2023).
Large-scale, multi-species pretraining (e.g., SevenNet-0 on M3GNet data) yields transferable potentials with test MAE $24$ meV/atom (energy) and $0.067$ eV/Å (forces) across $>100,000$ atoms (Park et al., 2024).

7. Uncertainty Quantification, Variants, and Future Directions

The Bayesian NequIP extension introduces model uncertainty estimation via an equivariant BNN framework. Equipped with a novel adaptive-mass SGHMC sampler, the Bayesian version matches deterministic accuracy (MAE-F $\sim 0.13$ –$0.19$ kcal/mol·Å) and yields well-calibrated force/energy uncertainties for active learning and OOD detection (ROC AUC $>0.9$ ) (Rensmeyer et al., 2023).

Architectural ablations and the unified Multi-ACE formalism (Batatia et al., 2022) show NequIP as a chain of sparsified polynomial body-ordered models, generalizing ACE and interpretable as a limited-length chain of multi-body interactions. Simpler, more interpretable variants (e.g., BOTNet) can match NequIP performance with further architectural constraints.

Key future directions include:

Enhanced hardware utilization (asynchronous pipelining, kernel/communication fusion).
Integrated long-range physics (electrostatics via Ewald; truncated multipole expansions).
Modular plugin architectures for seamless deployment in high-performance MD codes.
Adaptive and uncertainty-aware active learning, leveraging Bayesian NequIP within large-scale data generation loops.

NequIP and its derivatives set the state-of-the-art for symmetric, data-efficient, and scalable machine-learned interatomic potentials, supporting both foundational studies and production modeling of materials, biomolecules, and complex soft/hard matter (Batzner et al., 2021, Tan et al., 22 Apr 2025, Leimeroth et al., 5 May 2025, Park et al., 2024, Batatia et al., 2022, Vita et al., 2023, Rensmeyer et al., 2023, Lee et al., 2023, Seth et al., 2024).