Gaussian Approximation Potentials
- Gaussian Approximation Potentials (GAPs) are a machine-learning framework that employs Gaussian process regression to interpolate the quantum-mechanical potential energy surface.
- They leverage high-fidelity local atomic descriptors, such as SOAP, to capture invariant atomic environments and ensure accurate force and energy predictions.
- Systematic database expansion, hyperparameter tuning, and sparsification techniques enable GAPs to efficiently simulate a wide range of materials including defected metals and complex amorphous phases.
A Gaussian Approximation Potential (GAP) is a non-parametric, machine-learning framework for generating interatomic potentials that interpolate the quantum-mechanical potential energy surface using Gaussian process regression. The method is applicable to a wide variety of materials systems, and enables ab initio-level accuracy over system sizes and timescales not tractable via direct quantum-mechanical simulation (Bartók, 2010, Szlachta, 2014, Klawohn et al., 2023). GAP combines high-fidelity local atomic descriptors (e.g., SOAP) with rigorous Bayesian regression over reference data, and is systematically improvable through database expansion, descriptor tuning, or regularization adjustment. It has been validated across numerous classes of materials and configurations, including defected metals, semiconductors, alloys, complex amorphous phases, and more (Szlachta, 2014, Babaei et al., 2019, Klawohn et al., 2023).
1. Mathematical Framework and Regression Formalism
The core assumption is an atomic energy decomposition,

$$E = \sum_i \varepsilon(\mathbf{q}_i),$$

where $\mathbf{q}_i$ is a descriptor encoding the local atomic environment of atom $i$, and $\varepsilon$ is a function learned from data via Gaussian process regression (GPR) (Szlachta, 2014, Bartók, 2010). For a set of training environments $\{\mathbf{q}_n\}$, GPR is used to define a multivariate Gaussian prior over the unknown local energies, with covariance

$$\langle \varepsilon(\mathbf{q})\,\varepsilon(\mathbf{q}')\rangle = k(\mathbf{q}, \mathbf{q}'),$$

where $k$ is a positive-definite kernel function. Hyperparameters are typically tuned via maximum marginal likelihood or cross-validation.
Predictions for energies and (by differentiation) forces are made via

$$\varepsilon(\mathbf{q}^*) = \mathbf{k}^{*\top}\left[\mathbf{K} + \boldsymbol{\Lambda}\right]^{-1}\mathbf{y},$$

with $\mathbf{K}$ the kernel matrix over training environments, $\boldsymbol{\Lambda}$ the prior covariance (including regularization/noise), $\mathbf{k}^*$ the vector of kernel values between the query environment and the training environments, and $\mathbf{y}$ the training label vector (energies and forces) (Bartók et al., 2015, Szlachta, 2014).
Force and stress labels enter by analytic differentiation of the kernel with respect to atomic positions, creating a coupled block-structured kernel (Szlachta, 2014, Klawohn et al., 2023).
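The regression formalism above can be sketched numerically. The following is a minimal, self-contained illustration (not the GAP implementation): descriptor vectors and "local energy" labels are synthetic, and a squared-exponential kernel stands in for the SOAP kernel used in practice.

```python
import numpy as np

rng = np.random.default_rng(0)

def kernel(Q1, Q2, theta=1.0):
    # Squared-exponential kernel between two sets of descriptor vectors
    # (a stand-in for the SOAP kernel used in real GAP fits).
    d2 = ((Q1[:, None, :] - Q2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / theta**2)

Q_train = rng.normal(size=(20, 4))   # 20 training environments, 4-dim descriptors
y_train = np.sin(Q_train[:, 0])      # synthetic "local energy" labels
sigma2 = 1e-4                        # label noise / regularization

# Regression weights: alpha = (K + sigma^2 I)^{-1} y
K = kernel(Q_train, Q_train) + sigma2 * np.eye(len(Q_train))
alpha = np.linalg.solve(K, y_train)

# Prediction at new environments: eps(q*) = k*^T alpha
Q_test = rng.normal(size=(5, 4))
y_pred = kernel(Q_test, Q_train) @ alpha
```

In a real GAP fit the label vector also contains forces and stresses, which enter through kernel derivatives as described above; this sketch regresses scalar energy labels only.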
2. Atomic Descriptors and the SOAP Representation
Critical to GAP accuracy is the use of local atomic descriptors that are invariant under translation, rotation, reflection, and permutation of like atoms, but also complete and differentiable (Szlachta, 2014, Bartók, 2010). Early descriptors included Steinhardt bond order parameters and the "bispectrum" of atomic densities (Bartók, 2010), but the Smooth Overlap of Atomic Positions (SOAP) kernel is now predominant.
In SOAP, the neighbor density around atom $i$ is modeled as a sum of Gaussians,

$$\rho_i(\mathbf{r}) = \sum_j \exp\!\left(-\frac{|\mathbf{r} - \mathbf{r}_{ij}|^2}{2\sigma_{\text{atom}}^2}\right),$$

expanded in spherical harmonics and orthogonal radial functions, with expansion coefficients $c_{nlm}$. The rotationally and permutationally invariant power spectrum

$$p_{nn'l} = \sum_m c_{nlm}^{*}\, c_{n'lm}$$

is constructed and typically flattened into a high-dimensional descriptor vector $\mathbf{p}$ (Szlachta, 2014, Babaei et al., 2019). The similarity between two atomic environments is measured using the normalized dot-product kernel, often raised to a power $\zeta$:

$$k(\mathbf{p}, \mathbf{p}') = \left(\frac{\mathbf{p} \cdot \mathbf{p}'}{|\mathbf{p}|\,|\mathbf{p}'|}\right)^{\zeta}.$$
3. Training Data Construction and Protocols
GAP training relies on high-quality quantum-mechanical calculations, typically via density functional theory, covering the relevant space of atomic configurations (Bartók, 2010, Szlachta, 2014, Klawohn et al., 2023). For bcc tungsten, Szlachta et al. constructed a multistage training set: random bulk deformations, high-temperature phonon snapshots, monovacancy and surface structures, gamma-surface fault grids, dislocation cells, and vacancy–fault or dislocation–vacancy complexes, yielding a database of many thousands of local atomic environments (Szlachta, 2014).
Labels include total energies, atomic forces, and stresses, with label-specific noise estimates reflecting DFT convergence (on the order of 1 meV/atom for energies, 0.1 eV/Å for forces, and 0.01 eV/Å³ for stresses). Augmented training—where new classes of defects or strain environments are iteratively added—ensures coverage of local environments critical for targeted simulations (e.g., dislocation cores, surfaces, or interfaces).
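Per-label noise enters the fit as a (block-)diagonal regularization matrix. A small sketch of how such a matrix might be assembled, using illustrative noise values of the magnitudes quoted above:

```python
import numpy as np

# Illustrative per-label noise estimates (see text for typical magnitudes).
sigma_energy = 1e-3   # eV/atom
sigma_force  = 1e-1   # eV/Angstrom
sigma_stress = 1e-2   # eV/Angstrom^3

# A toy label vector: 10 energy labels, 30 force components, 6 stress components.
n_energy, n_force, n_stress = 10, 30, 6
noise = np.concatenate([
    np.full(n_energy, sigma_energy**2),
    np.full(n_force,  sigma_force**2),
    np.full(n_stress, sigma_stress**2),
])
Lambda = np.diag(noise)  # regularization added to the kernel matrix in the fit
```

Each label type thus carries its own tolerance, so well-converged energies constrain the fit more tightly than noisier force or stress components.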
4. Hyperparameters, Sparsification, and Regularization
Because kernel regression scales cubically with the number of environments, GAP employs sparse GPR, selecting a set of "pseudo-inputs" (or "sparse points") $\{\mathbf{q}_m\}_{m=1}^{M}$, with $M \ll N$, for which regression weights are optimized. Selection criteria include maximizing marginal likelihood, CUR decomposition, k-means, or farthest-point search in descriptor space (Szlachta, 2014, Klawohn et al., 2023).
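Of the selection criteria listed, farthest-point search is the simplest to illustrate. A greedy sketch over synthetic descriptor vectors:

```python
import numpy as np

def farthest_point_sample(Q, m, seed=0):
    """Greedily pick m rows of Q that are mutually far apart in
    descriptor space (one of several sparse-point selection schemes)."""
    rng = np.random.default_rng(seed)
    idx = [int(rng.integers(len(Q)))]
    # Distance of every point to the current sparse set.
    d = np.linalg.norm(Q - Q[idx[0]], axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(d))          # farthest remaining point
        idx.append(nxt)
        d = np.minimum(d, np.linalg.norm(Q - Q[nxt], axis=1))
    return np.array(idx)

Q = np.random.default_rng(1).normal(size=(200, 8))  # synthetic descriptors
sparse_idx = farthest_point_sample(Q, 20)
```

The resulting sparse set spreads over the training manifold, which is the intuition behind using it to anchor the reduced-rank kernel expansion.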
Key hyperparameters include:
- Noise levels per target ($\sigma_E$, $\sigma_F$, $\sigma_V$)
- Kernel amplitude $\delta$ ($\sim 1$ eV for metals)
- SOAP Gaussian width $\sigma_{\text{atom}}$ ($\sim 0.5$ Å)
- Cutoff radii (typically $4$–$6$ Å)
- Radial and angular basis truncation ($n_{\max}$, $l_{\max}$ up to $10$–$12$)
- Kernel exponent $\zeta$ (typically $2$–$6$)
Regularization is imposed via the noise assigned to labels and, in practice, via the number of sparse points.
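Collected together, a hyperparameter set of this kind might look as follows. The key names below are illustrative, not the exact keywords of any particular fitting code:

```python
# Hypothetical SOAP-GAP hyperparameter set (names are illustrative only).
gap_params = {
    "cutoff": 5.0,          # Angstrom, local-environment cutoff radius
    "sigma_atom": 0.5,      # Angstrom, SOAP Gaussian width
    "n_max": 10,            # radial basis truncation
    "l_max": 10,            # angular basis truncation
    "zeta": 4,              # kernel exponent
    "delta": 1.0,           # eV, kernel amplitude
    "sigma_energy": 1e-3,   # eV/atom, energy-label noise
    "sigma_force": 1e-1,    # eV/Angstrom, force-label noise
    "n_sparse": 2000,       # number of sparse points
}
```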
The combined linear system for the regression weights $\mathbf{a}$ is

$$\left(\mathbf{K}_{MM} + \mathbf{K}_{MN}\,\boldsymbol{\Lambda}^{-1}\,\mathbf{K}_{NM}\right)\mathbf{a} = \mathbf{K}_{MN}\,\boldsymbol{\Lambda}^{-1}\,\mathbf{y},$$

where the blocks of $\mathbf{K}_{NM}$ denote couplings between energy and force labels and the sparse points via kernel derivatives (Szlachta, 2014).
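The sparse linear system can be assembled and solved directly. A toy sketch with synthetic descriptors and scalar labels (real fits include force blocks via kernel derivatives):

```python
import numpy as np

rng = np.random.default_rng(2)

def rbf(A, B):
    # Squared-exponential kernel (stand-in for the SOAP kernel).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2)

Q_train = rng.normal(size=(50, 4))   # N = 50 training environments
Q_sparse = Q_train[:10]              # M = 10 sparse points (toy choice)
y = rng.normal(size=50)              # synthetic labels
lam_inv = np.full(50, 1.0 / 1e-2)    # inverse per-label noise, diag(Lambda)^-1

K_MM = rbf(Q_sparse, Q_sparse)
K_MN = rbf(Q_sparse, Q_train)

# (K_MM + K_MN Lambda^{-1} K_NM) a = K_MN Lambda^{-1} y
A = K_MM + (K_MN * lam_inv) @ K_MN.T
b = (K_MN * lam_inv) @ y
a = np.linalg.solve(A + 1e-8 * np.eye(10), b)  # small jitter for stability

pred = rbf(Q_train, Q_sparse) @ a    # predictions over training set
```

Only the $M \times M$ system is solved, which is the source of the cost reduction relative to full GPR over all $N$ environments.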
5. Quantitative Validation and Benchmarks
Performance is tested against independent DFT data for physical properties relevant to the target system. For tungsten (Szlachta, 2014):
- Elastic constants: SOAP-GAP models reproduce all three cubic elastic constants to within 1% of DFT.
- Phonons: the RMS error in phonon frequencies is 0.12 THz (SOAP-GAP), vs. 0.22 THz (bispectrum-GAP) and a systematic ~0.2 THz underprediction (Finnis–Sinclair).
- Vacancy formation energies and surface energies: agreement with DFT to within 0.01 eV.
- Gamma-surface and screw-dislocation core structures: GAP reproduces DFT symmetry and Peierls barrier with <0.05 eV error (1.07 eV/b vs. DFT 1.1 eV/b).
- System-size scalability: mobile-dislocation simulations in cells far larger than those accessible to direct DFT, matching DFT energetics at the meV/atom level.
In all cases, the SOAP-GAP reduces the RMS force errors on complex defects (e.g., dislocation cores) by an order of magnitude relative to classical potentials or bispectrum-only descriptors.
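The RMS force error used as the benchmark metric above is straightforward to compute. A sketch with synthetic force arrays standing in for model and reference (DFT) data:

```python
import numpy as np

def rms_force_error(F_model, F_ref):
    """RMS error over all Cartesian force components, arrays of shape (n_atoms, 3)."""
    return float(np.sqrt(np.mean((F_model - F_ref) ** 2)))

rng = np.random.default_rng(3)
F_dft = rng.normal(size=(100, 3))                  # synthetic reference forces
F_gap = F_dft + 0.01 * rng.normal(size=(100, 3))   # synthetic model with small error

err = rms_force_error(F_gap, F_dft)  # close to the injected 0.01 eV/A noise scale
```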
6. Limitations and Challenges
Despite systematic improvability, several intrinsic and practical limitations exist (Szlachta, 2014, Klawohn et al., 2023):
- Bispectrum descriptors converge non-monotonically with angular cutoff and are prone to high-frequency noise; SOAP avoids this by using Gaussians in the neighbor density.
- Interpolation is reliable within the training manifold; extrapolation is unphysical unless the training database directly samples the relevant chemistry or topology.
- DFT costs limit the diversity and size of training data; active learning and regression variance estimates can help fill important gaps (e.g., uncommon defect environments).
- High-dimensional descriptors such as bond-based SOAP become computationally expensive for high-coordination environments; such descriptors are therefore better suited to low-coordination applications.
- Explicit modeling of long-range effects (electrostatics, magnetism) requires augmentation, as current implementations are typically local and non-magnetic.
- Transferability to multi-component alloys and handling of non-collinear magnetic order remains a frontier topic.
7. Broader Impact and Applications
GAP models have been established as highly accurate, data-driven interatomic potentials for a broad swath of materials—metals, alloys, semiconductors, ionic solids, molecular materials, and coarse-grained biomolecules—enabling simulations of phase stability, defect energetics, lattice dynamics, and mechanical response at scales beyond traditional DFT (Szlachta, 2014, Babaei et al., 2019, Klawohn et al., 2023, John, 2016). Their systematic construction protocol—careful DFT sampling, robust local descriptors, kernel-based regression, and rigorously validated prediction—forms the foundation of machine-learned atomistic modeling in contemporary computational materials science.