NequIP: E(3)-Equivariant Interatomic Potentials

Updated 2 May 2026
  • NequIP is an E(3)-equivariant graph neural network that uses tensor-valued features and message-passing operations to accurately predict interatomic potentials.
  • It is highly data-efficient, requiring 1–3 orders of magnitude fewer training points than invariant neural or kernel models, which makes high-fidelity simulations tractable.
  • Its design guarantees global invariance and supports extensions like Bayesian models and long-range physics modules for enhanced predictive capability.

Neural Equivariant Interatomic Potentials (NequIP) are E(3)-equivariant graph neural networks (GNNs) for learning interatomic potentials from ab-initio data. NequIP models predict total system energies, forces, and (optionally) stress tensors with accuracy and transferability suitable for simulations of molecules, glasses, extended solids, and complex biological systems. The central innovation is strict equivariance under rigid motions (rotations and translations), together with symmetry under permutations of like atoms, at every layer, realized by tensor-valued features and message-passing operations built from irreducible representations of SO(3). NequIP achieves state-of-the-art data efficiency and fidelity, outperforming invariant and lower-order MLIP architectures in both small-data and high-complexity regimes (Batzner et al., 2021, Leimeroth et al., 5 May 2025).

1. E(3)-Equivariant Architecture

Each atom $i$ is associated with a collection of feature tensors $h_i^{(\ell)} \in \mathbb{R}^{d_\ell \times (2\ell+1)}$, for ranks $\ell = 0, 1, 2, \ldots, \ell_\text{max}$, which transform as spherical harmonics of order $\ell$ under three-dimensional rotations. Atomic environments are encoded by expanding neighbor-atom features using a tensor product of learned radial basis functions $R_n(r_{ij})$ (typically Gaussian or Bessel), real spherical harmonics $Y_\ell^m(\hat{r}_{ij})$, and Clebsch–Gordan coupling between angular orders:

$$m_{ij}^{(\ell)} = \sum_{\ell_1, \ell_2} W^{(\ell;\ell_1,\ell_2)} \left[ Y^{\ell_1}(\hat{r}_{ij}) \otimes h_j^{(\ell_2)} \right]_{\ell} R_{\ell}(\|r_{ij}\|)$$

with $W^{(\ell;\ell_1,\ell_2)}$ parameterized as small fully-connected networks. For each atom, multiple message-passing layers successively update these features:

$$h_i^{(t+1),\ell} = h_i^{(t),\ell} + \mathrm{Nonlin}\left( W^{(t)} m_i^{(t),\ell} \right)$$

Here $m_i^{(t),\ell}$ denotes the message aggregated over the neighbors $j$ of atom $i$ at layer $t$. Residual self-connections and channel-wise nonlinearities (e.g. gated SiLU) are empirically critical for performance and extrapolation (Batatia et al., 2022). Normalizing messages by the local neighbor count further stabilizes training.
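The update rule can be illustrated for the simplest nontrivial case, $\ell_\text{max} = 1$, where each atom carries one scalar and one vector feature. The sketch below is a toy, standard-library stand-in and not the NequIP implementation: the weight keys (`"00"`, `"11"`, `"10"`, `"01"`), the cosine radial weight, and all function names are assumptions made for illustration. It uses the four coupling paths available at this order, applies a residual update with a gated nonlinearity, and checks numerically that rotating and translating the input rotates the vector outputs while leaving the scalar outputs unchanged.

```python
import math

def silu(x):
    return x / (1.0 + math.exp(-x))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def radial(r, cutoff):
    # Toy radial weight that vanishes smoothly at the cutoff (an assumption,
    # standing in for a learned radial network).
    return math.cos(0.5 * math.pi * r / cutoff) ** 2 if r < cutoff else 0.0

def layer(pos, s, v, w, cutoff=3.0):
    """One residual message-passing step on scalar s[i] and vector v[i] features.

    Weight keys are (order of the spherical harmonic, order of the neighbor feature).
    """
    new_s, new_v = [], []
    for i, ri in enumerate(pos):
        m_s, m_v, n = 0.0, [0.0, 0.0, 0.0], 0
        for j, rj in enumerate(pos):
            if i == j:
                continue
            d = [rj[k] - ri[k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            if r >= cutoff:
                continue
            u = [c / r for c in d]  # unit vector, proportional to Y^1
            R = radial(r, cutoff)
            dot = sum(u[k] * v[j][k] for k in range(3))
            # Paths into the scalar channel: 0 x 0 -> 0 and 1 x 1 -> 0.
            m_s += R * (w["00"] * s[j] + w["11"] * dot)
            # Paths into the vector channel: 1 x 0 -> 1 and 0 x 1 -> 1.
            for k in range(3):
                m_v[k] += R * (w["10"] * s[j] * u[k] + w["01"] * v[j][k])
            n += 1
        n = max(n, 1)  # normalize messages by the neighbor count
        gate = sigmoid(m_s / n)  # invariant gate keeps the vector update equivariant
        new_s.append(s[i] + silu(m_s / n))
        new_v.append([v[i][k] + gate * m_v[k] / n for k in range(3)])
    return new_s, new_v

def rotate_z(vec, angle):
    c, sn = math.cos(angle), math.sin(angle)
    x, y, z = vec
    return [c * x - sn * y, sn * x + c * y, z]

def equivariance_error(angle=0.7, shift=(0.3, -1.2, 2.0)):
    """Largest mismatch between 'transform then apply' and 'apply then transform'."""
    pos = [[0.0, 0.0, 0.0], [1.1, 0.2, -0.3], [-0.4, 1.3, 0.5], [0.6, -0.9, 1.0]]
    s = [0.5, -0.2, 0.1, 0.9]
    v = [[0.1, 0.0, 0.3], [0.0, -0.2, 0.4], [0.5, 0.5, 0.0], [-0.3, 0.2, 0.1]]
    w = {"00": 0.8, "11": -0.5, "10": 0.3, "01": 1.2}
    s_out, v_out = layer(pos, s, v, w)
    pos_t = [[c + d for c, d in zip(rotate_z(p, angle), shift)] for p in pos]
    v_t = [rotate_z(x, angle) for x in v]
    s_out_t, v_out_t = layer(pos_t, s, v_t, w)
    err = max(abs(a - b) for a, b in zip(s_out, s_out_t))
    for x, y in zip(v_out, v_out_t):
        err = max(err, max(abs(a - b) for a, b in zip(rotate_z(x, angle), y)))
    return err
```

Calling `equivariance_error()` should return a value at floating-point noise level. The check only covers rotations about one axis, so it is a sanity test rather than a proof of full E(3) equivariance.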

After $L$ layers, only the $\ell = 0$ scalar block contributes to the per-atom energy read-out:

$$E_i = \mathrm{MLP}\left( h_i^{(L),\ell=0} \right)$$

$$E = \sum_i E_i$$

Global invariance to translations and rotations (and permutation of identical atoms) is guaranteed by symmetry properties of the design (Batzner et al., 2021, Istas et al., 2024).

2. Training Protocols and Data Efficiency

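Training minimizes a weighted combination of losses on energy, forces, and occasionally stresses, typically:

$$\mathcal{L} = \lambda_E \left\| \hat{E} - E \right\|^2 + \lambda_F \frac{1}{3N} \sum_{i=1}^{N} \left\| \hat{F}_i - F_i \right\|^2 + \lambda_\sigma \left\| \hat{\sigma} - \sigma \right\|^2$$

where hats denote model predictions, $N$ is the number of atoms, and the $\lambda$ are loss weights.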

Forces are obtained as the negative gradient of the total energy with respect to atomic positions via automatic differentiation, which yields strictly conservative, equivariant force fields. NequIP exhibits high data efficiency, attaining low force errors and favorable learning-curve exponents with 1–3 orders of magnitude fewer training points than invariant neural or kernel models (e.g., sub-10 meV/Å MAEs on MD-17 with 1,000 frames) (Batzner et al., 2021, Leimeroth et al., 5 May 2025). This permits fitting to high-level quantum data, such as coupled-cluster, making large-scale, high-fidelity molecular dynamics tractable.
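As a minimal sketch of these two ideas, the block below computes forces as the negative gradient of a total energy and evaluates a weighted energy-plus-force loss. Central finite differences stand in for automatic differentiation so that the example needs only the standard library. The function `toy_energy`, the loss weights `lam_e` and `lam_f`, and the normalization of the force term by $3N$ are assumptions made for illustration, not NequIP's actual settings.

```python
import math

def toy_energy(pos):
    # Hypothetical stand-in for a trained model: a harmonic pair energy that
    # depends on interatomic distances only, so it is rotation- and
    # translation-invariant.
    e = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            r = math.dist(pos[i], pos[j])
            e += 0.5 * (r - 1.0) ** 2
    return e

def forces(energy_fn, pos, h=1e-5):
    # F_i = -dE/dr_i, approximated by central differences (autodiff stand-in).
    out = []
    for i in range(len(pos)):
        f = []
        for k in range(3):
            plus = [list(p) for p in pos]
            minus = [list(p) for p in pos]
            plus[i][k] += h
            minus[i][k] -= h
            f.append(-(energy_fn(plus) - energy_fn(minus)) / (2.0 * h))
        out.append(f)
    return out

def loss(e_pred, e_ref, f_pred, f_ref, lam_e=1.0, lam_f=1.0):
    # Squared energy error plus mean squared error over all 3N force components.
    n = len(f_ref)
    e_term = (e_pred - e_ref) ** 2
    f_term = sum(
        (fp[k] - fr[k]) ** 2 for fp, fr in zip(f_pred, f_ref) for k in range(3)
    ) / (3 * n)
    return lam_e * e_term + lam_f * f_term

# Because the energy depends only on relative positions, the forces derived
# from it sum to zero over all atoms.
positions = [[0.0, 0.0, 0.0], [1.2, 0.0, 0.0], [0.0, 0.9, 0.3]]
f = forces(toy_energy, positions)
net_force = [sum(fi[k] for fi in f) for k in range(3)]
```

Deriving forces from a single scalar energy, rather than predicting them with a separate output head, is what makes the resulting force field conservative by construction.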

Hyperparameters are typically:

  • […]–[…] message-passing layers
  • Maximum rotation order $\ell_\text{max}$ of […] or […]
  • Feature width per irrep 32–128
  • Radial cutoff […]–[…] Å, radial basis size 8–32
  • Optimizers: Adam or AMSGrad
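To make the radial cutoff and radial basis size concrete, the following sketch evaluates a Bessel-type radial basis multiplied by a smooth polynomial cutoff envelope, a common construction for this kind of model. The default cutoff of 5.0 Å, the envelope power `p=6`, and the function names are assumptions chosen for illustration; only the basis size of 8 is taken from the range quoted in the list.

```python
import math

def bessel_basis(r, cutoff, n_basis=8):
    # sqrt(2 / r_c) * sin(n * pi * r / r_c) / r for n = 1..n_basis; requires r > 0.
    return [
        math.sqrt(2.0 / cutoff) * math.sin(n * math.pi * r / cutoff) / r
        for n in range(1, n_basis + 1)
    ]

def poly_envelope(r, cutoff, p=6):
    # Polynomial that equals 1 at r = 0 and goes smoothly to 0 at r = cutoff.
    x = r / cutoff
    if x >= 1.0:
        return 0.0
    return (
        1.0
        - 0.5 * (p + 1) * (p + 2) * x ** p
        + p * (p + 2) * x ** (p + 1)
        - 0.5 * p * (p + 1) * x ** (p + 2)
    )

def radial_features(r, cutoff=5.0, n_basis=8):
    # Radial embedding of one interatomic distance, zero at and beyond the cutoff.
    env = poly_envelope(r, cutoff)
    return [b * env for b in bessel_basis(r, cutoff, n_basis)]

features = radial_features(2.0)  # 8 numbers describing a 2.0 Å neighbor distance
```

The envelope matters for molecular dynamics: without it, atoms crossing the cutoff would cause discontinuities in the energy and forces.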

3. Extensions: Bayesian, Foundation, and Specialized Models

Bayesian NequIP extends the deterministic architecture to a full Bayesian neural network (BNN) with Gaussian priors on all parameters:

$$p(\theta) = \mathcal{N}\left( \theta \mid 0, \sigma_p^2 I \right)$$

and a factorizable Gaussian likelihood over energies and forces. An adaptive-step SGHMC algorithm with variable mass matrix (AMSGrad-inspired) permits scalable posterior sampling. This yields calibrated uncertainty quantification, outlier detection, and improved mean log-likelihoods versus MC dropout and deterministic models, with only minor computational overhead (Rensmeyer et al., 2023).
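The structure of such a posterior can be sketched in a few lines: a Gaussian log-prior over the parameters plus a Gaussian log-likelihood that factorizes over energies and force components. This is an illustrative toy, not the method of the cited work. The zero-mean prior, the noise scales `sigma_e` and `sigma_f`, and the one-parameter `toy_model` are all assumptions, and the sampler itself (SGHMC) is not shown.

```python
import math

def log_normal(x, mean, std):
    # Log-density of a one-dimensional Gaussian.
    return -0.5 * math.log(2.0 * math.pi * std * std) - 0.5 * ((x - mean) / std) ** 2

def log_prior(theta, prior_std=1.0):
    # Independent zero-mean Gaussian prior on every parameter (assumption).
    return sum(log_normal(t, 0.0, prior_std) for t in theta)

def log_likelihood(pred, ref, sigma_e=0.01, sigma_f=0.05):
    # Factorized Gaussian likelihood over one energy and all force components.
    e_pred, f_pred = pred
    e_ref, f_ref = ref
    ll = log_normal(e_ref, e_pred, sigma_e)
    ll += sum(log_normal(fr, fp, sigma_f) for fp, fr in zip(f_pred, f_ref))
    return ll

def log_posterior(theta, dataset, model, prior_std=1.0, sigma_e=0.01, sigma_f=0.05):
    # Unnormalized log-posterior: log-prior plus summed log-likelihoods.
    lp = log_prior(theta, prior_std)
    for x, ref in dataset:
        lp += log_likelihood(model(theta, x), ref, sigma_e, sigma_f)
    return lp

def toy_model(theta, x):
    # Hypothetical one-parameter model: E = a * x^2 and F = -dE/dx = -2 * a * x.
    a = theta[0]
    return a * x * x, [-2.0 * a * x]

# Synthetic data generated with a = 0.5; the posterior should prefer that value.
data = [(x, toy_model([0.5], x)) for x in (0.5, 1.0, 1.5)]
at_truth = log_posterior([0.5], data, toy_model)
off_truth = log_posterior([0.6], data, toy_model)
```

A sampler draws parameter sets in proportion to the exponential of this quantity, and the spread of predictions across those samples provides the uncertainty estimate.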

Specialized derivatives such as PANIP (for non-covalent interactions in proteins) employ ensemble NequIP models with fragmentwise pretraining, cross-validation aggregation, and automated active learning (MFAL) to distill representative datasets from millions of candidate dimers. PANIP achieves MAEs at the 0.2 kcal/mol level on challenging out-of-distribution test sets and excels at pose ranking in docking tasks (Zeng et al., 14 Jan 2026). Compact NequIP variants (e.g. Nequix) further optimize the architecture, adding equivariant RMSNorm, dropping unnecessary irreps, and using advanced optimizers (Muon) to reduce parameter count and training time with mild degradation of accuracy (Koker et al., 22 Aug 2025).

4. Integration with Long-Range Physics and Electrostatics

Baseline NequIP is a strictly short-range model with a finite radial cutoff of a few Å, so it cannot represent $1/r$-decaying electrostatics. The Latent Ewald Summation (LES) module augments NequIP by predicting latent atomic charges $q_i$ from the final $\ell = 0$ features, then computing an Ewald-summed long-range energy:

$$E^\text{lr} = \frac{1}{2 \varepsilon_0 V} \sum_{0 < |\mathbf{k}| < k_c} \frac{1}{k^2} e^{-\sigma^2 k^2 / 2} \left| S(\mathbf{k}) \right|^2$$

where $S(\mathbf{k}) = \sum_i q_i e^{i \mathbf{k} \cdot \mathbf{r}_i}$ is the structure factor of the latent charges, $V$ is the cell volume, and $\sigma$ is a smearing width. This enables NequIP to capture long-range electrostatics, recover Born effective charges and polarization directly from energy/force data, and substantially improve force errors and physical predictions in complex systems (e.g. water, peptide, metal-oxide surfaces), with minimal computational overhead (Kim et al., 18 Jul 2025).
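A reciprocal-space sum of this kind can be sketched directly. The block below assumes the energy takes the form $E^\text{lr} = \frac{1}{2 \varepsilon_0 V} \sum_{\mathbf{k} \neq 0} \frac{e^{-\sigma^2 k^2 / 2}}{k^2} |S(\mathbf{k})|^2$ with $S(\mathbf{k}) = \sum_i q_i e^{i \mathbf{k} \cdot \mathbf{r}_i}$, in a cubic periodic box and in reduced units with `eps0 = 1`. The function name, the integer cutoff `n_max`, and the fixed input charges are assumptions for illustration: in an actual model the charges would be predicted from atomic features rather than supplied by hand.

```python
import cmath
import math
from itertools import product

def les_energy(pos, charges, box, sigma=1.0, n_max=4, eps0=1.0):
    """Smeared reciprocal-space electrostatic energy in a cubic periodic box."""
    unit = 2.0 * math.pi / box
    k_cut_sq = (unit * n_max) ** 2
    total = 0.0
    for n in product(range(-n_max, n_max + 1), repeat=3):
        if n == (0, 0, 0):
            continue  # the k = 0 term is excluded
        k = [unit * c for c in n]
        k2 = sum(c * c for c in k)
        if k2 > k_cut_sq:
            continue  # spherical cutoff in reciprocal space
        # Structure factor S(k) = sum_i q_i * exp(i k . r_i)
        s_k = sum(
            q * cmath.exp(1j * sum(k[a] * r[a] for a in range(3)))
            for q, r in zip(charges, pos)
        )
        total += math.exp(-0.5 * sigma * sigma * k2) / k2 * abs(s_k) ** 2
    return total / (2.0 * eps0 * box ** 3)

# Two opposite latent charges in a box of side 6 (reduced units).
positions = [[1.0, 1.0, 1.0], [3.0, 2.0, 1.5]]
latent_q = [1.0, -1.0]
e_lr = les_energy(positions, latent_q, box=6.0)

# Shifting every atom by the same vector only changes the phase of S(k),
# so the energy is unchanged.
shifted = [[c + d for c, d in zip(p, (0.4, -0.7, 2.2))] for p in positions]
e_lr_shifted = les_energy(shifted, latent_q, box=6.0)
```

Because the sum runs over reciprocal-space vectors of the periodic cell, interactions beyond the real-space cutoff of the short-range model are included at a cost that does not depend on a neighbor list.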

5. Computational Performance, Parallelization, and MD Integration

NequIP is implemented in PyTorch with CPU and GPU backends. With GPU acceleration, the reported cost per atom per MD step (at the 0.00007 ms level for Si–O on an A100) is competitive with or lower than that of classical EAM on CPUs. Multi-node distributed parallelism is available in recent NequIP and compatible frameworks via distributed data parallelism (DDP) and custom kernel fusion (PyTorch Inductor, Triton); expensive operations such as tensor-product computation have been further accelerated with custom CUDA kernels (Tan et al., 22 Apr 2025).

Domain-decomposition implementations support large-scale MD via spatial partitioning, with node features of boundary ("ghost") atoms communicated between neighboring domains, as in the SevenNet variant. Weak-scaling efficiencies above 80% are demonstrated up to 32 GPUs; very large systems are accessible at near-ideal scaling, provided each device holds sufficiently many atoms (Park et al., 2024).
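The ghost-atom idea can be sketched with a one-dimensional slab decomposition of a periodic box. Each domain owns the atoms inside its slab and also needs a copy of any atom owned elsewhere that lies within one cutoff of its boundaries. The function name, the slab geometry, and the dictionary layout are assumptions for illustration and do not reflect how SevenNet or LAMMPS organize their data.

```python
def decompose(positions, box, n_domains, cutoff):
    """Assign atoms to slabs along x and list the ghost atoms each slab needs."""
    width = box / n_domains
    domains = [{"owned": [], "ghost": []} for _ in range(n_domains)]

    def periodic_dist(a, b):
        # Shortest distance between two x coordinates in a periodic box.
        return abs(((a - b + 0.5 * box) % box) - 0.5 * box)

    for idx, p in enumerate(positions):
        x = p[0] % box
        owner = min(int(x // width), n_domains - 1)
        domains[owner]["owned"].append(idx)
        for d in range(n_domains):
            if d == owner:
                continue
            lo, hi = d * width, (d + 1) * width
            # The atom lies outside slab d, so its distance to the slab is the
            # distance to the nearer of the two slab faces.
            if min(periodic_dist(x, lo), periodic_dist(x, hi)) < cutoff:
                domains[d]["ghost"].append(idx)
    return domains

# Five atoms in a box of length 10 split into two slabs, with a cutoff of 1.
atoms = [[0.5, 0.0, 0.0], [4.5, 1.0, 0.0], [5.2, 0.0, 2.0], [9.8, 3.0, 1.0], [2.5, 0.0, 0.0]]
parts = decompose(atoms, box=10.0, n_domains=2, cutoff=1.0)
```

The atom at x = 2.5 is far from both slab faces and is never copied, which illustrates why scaling improves when each device holds many atoms: the ghost layer becomes a small fraction of the total.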

Direct integration with LAMMPS is provided via plugins supporting single-GPU runs; multi-GPU/MPI is handled by extended implementations or external wrappers.

6. Application Domains and Benchmarks

NequIP has achieved state-of-the-art benchmarks across a spectrum of chemistry and materials challenges:

  • Small molecules (MD-17/revMD-17, molecules@CCSD(T)): force MAE below 10 meV/Å with 1,000 frames, outperforming other MLIPs and kernel methods (Batzner et al., 2021).
  • Liquids and glasses (e.g. water, Li₄P₂O₇): accurate radial/angular distributions and phase properties from small numbers of training frames.
  • Metals and transition metals: in the TM23 d-block set, NequIP attains 2–3× lower force and energy errors than leading kernel-based ACE/FLARE models, especially for late transition metals (Owen et al., 2023).
  • Amorphous solid electrolytes (LiPON): sub-14 meV/Å force errors and large speedups over DFT/AIMD, enabling nanosecond trajectories of large systems and direct calculation of ion transport at interfaces (Seth et al., 2024).
  • Hydrogen under pressure: liquid-liquid phase transition (LLPT) analysis via finite-size scaling; 170 meV/Å force MAE; a large speedup relative to AIMD, permitting system sizes and timescales inaccessible to DFT (Istas et al., 2024).
  • Biophysical and biochemical systems: non-covalent interactions in proteins and ligand binding energies at quantum accuracy (Zeng et al., 14 Jan 2026).

Comparative studies confirm NequIP as Pareto optimal in the accuracy-cost trade-off for many complex systems, sometimes outperformed in specific regimes by MACE or Allegro, or in parameter efficiency by heavy optimizations such as Nequix (Leimeroth et al., 5 May 2025, Koker et al., 22 Aug 2025).

7. Design Principles, Variants, and Limitations

Core design choices critical for NequIP's accuracy include:

  • E(3) equivariance (with features of rotation order at least $\ell = 1$), enforcing explicit directionality and angular resolution.
  • Residual self-connection updates at each layer.
  • Channel- and message normalization for convergence and extrapolation.
  • Final nonlinear (MLP) readout on invariant scalar features.

Variants such as BOTNet demonstrate that almost all nonlinearity can be relegated to the final layer while maintaining accuracy on key benchmarks, corresponding to a body-ordered linear expansion plus a nonlinear tail (Batatia et al., 2022). The equivalence between layered equivariant GNNs and multi-layer ACE polynomial models establishes connections to kernel-based and polynomial MLIPs.

Limitations include short-range nature of native NequIP, which must be augmented (e.g. via LES) for systems with significant long-range physics; potential instability in stress predictions if trained without carefully regularized stress terms; and computational cost per step, which is higher than for kernel ACE or classical force fields, though vastly lower than ab-initio methods. Scaling to thousands of GPU nodes is feasible but requires specialized libraries and careful workload balancing (Tan et al., 22 Apr 2025, Park et al., 2024).


In summary, NequIP represents a rigorously E(3)-equivariant, tensorial, message-passing neural architecture that combines strong data efficiency, transferability, and physical guarantees, validated across a wide domain of chemistry, materials, and biophysics, with numerous extensions, optimized implementations, and integration frameworks now established in the MLIP ecosystem (Batzner et al., 2021, Leimeroth et al., 5 May 2025, Rensmeyer et al., 2023, Batatia et al., 2022, Tan et al., 22 Apr 2025, Kim et al., 18 Jul 2025, Seth et al., 2024, Istas et al., 2024, Zeng et al., 14 Jan 2026, Koker et al., 22 Aug 2025, Park et al., 2024, Owen et al., 2023).
