Graph Atomic Cluster Expansion (GRACE)

Updated 4 July 2026

Graph Atomic Cluster Expansion (GRACE) is a graph-based generalization of ACE, expanding admissible clusters from star-shaped to tree structures to capture semilocal atomic interactions.
It integrates a formally complete many-body basis with recursive tensor factorization and message-passing, offering enhanced accuracy and computational efficiency.
Benchmark studies indicate that GRACE models achieve state-of-the-art performance in energy, force, and stress predictions across diverse materials and chemical systems.

Graph Atomic Cluster Expansion (GRACE) is a graph-based generalization of the Atomic Cluster Expansion (ACE) for machine-learning interatomic potentials and related many-particle representations. In its modern usage, GRACE denotes both a mathematical framework and a family of universal machine learning interatomic potentials (MLIPs): it extends ACE from local, atom-centered star clusters to admissible graph-based clusters, especially trees, and evaluates the resulting basis recursively in a message-passing style. The framework is presented as combining a formally complete basis for interatomic interactions with the computational and inductive advantages of graph-based models, and the 2025 foundational-potential work positions it as a substrate for universal, pretrained, and adaptable atomistic simulation across the periodic table (Bochkarev et al., 2023, Lysogorskiy et al., 25 Aug 2025).

1. Conceptual definition and historical development

GRACE was introduced to address a limitation of standard ACE. In the standard ACE setting, a property is decomposed into atomic contributions,

$E=\sum_i E_i,$

and each $E_i$ is expanded in basis functions centered on atom $i$. This yields a local and complete description for atom-centered environments, but it is restricted to star-shaped clusters in which all edges attach directly to a single root atom. The 2023 GRACE formulation enlarges this basis philosophy from local atom-centered stars to global graph-based clusters, particularly trees, in order to represent semilocal interactions efficiently and in physically and chemically transparent form (Bochkarev et al., 2023).

The motivation given for this extension is that several important effects are not naturally star-shaped. The 2023 paper lists interactions appearing after diagonalization of Hamiltonians, relaxation or self-consistency effects, longer-ranged correlation or dispersion contributions, and multi-hop interactions in graph-like electronic structure as examples of semilocal phenomena that are better matched to connected graph motifs beyond stars (Bochkarev et al., 2023). In this sense, GRACE is not merely a change of neural architecture; it is a redefinition of the admissible cluster topologies in the underlying expansion.

The 2025 foundational-potential paper develops this idea into a universal MLIP program. There, GRACE is presented as a framework for building universal machine learning interatomic potentials that are accurate, efficient, and adaptable across the periodic table. The key claim is that a single potential should cover many elements and chemistries without requiring retraining from scratch when a new element is introduced. Rather than brute-force enumeration of many-body interactions, GRACE embeds chemistry into a low-dimensional learned space and uses graph-structured ACE expansions to represent the resulting atomic environments efficiently (Lysogorskiy et al., 25 Aug 2025).

A central distinction from other model classes is stated explicitly. Compared with typical graph neural networks, GRACE is described as providing a complete basis for atomic environments rather than an empirically chosen latent message-passing scheme. Compared with earlier ACE models, it extends ACE to tree graphs and recursive graph evaluations, thereby allowing message passing and semilocal interactions within the ACE formalism. The paper further states that, because the basis is complete and the chemical embeddings are learned, GRACE can in principle rationalize other graph-network architectures while enabling sparse, low-rank, recursively evaluable representations (Lysogorskiy et al., 25 Aug 2025).

2. Mathematical basis and graph construction

The 2023 GRACE paper formulates the expansion at the level of the full configuration

$\boldsymbol{\sigma}=(\mathbf r_1,\mu_1,\mathbf r_2,\mu_2,\dots,\mathbf r_N,\mu_N),$

where $\mathbf r_i$ are positions and $\mu_i$ denotes species and possibly other on-site variables such as magnetic moments or charges (Bochkarev et al., 2023). Each atom carries single-particle basis functions

$\phi_{iu}(\Delta \mathbf r)=\phi_{iu}(\mathbf r-\mathbf r_i),$

and chemistry or on-site variables are encoded by

$\chi_{i\kappa}(\mu).$

For rotational equivariance, the basis uses

$\phi_{iu}(\mathbf r)=R_{nl}(r)Y_{lm}(\hat{\mathbf r}), \qquad u=nlm.$

GRACE then forms products of one-particle factors over a graph or configuration,

$\Phi_\alpha = \prod_{k=1}^{N}\phi_{i_k u_k}(\mathbf r_k)\, \prod_{k'=1}^{N}\chi_{k'\kappa_{k'}}(\mu_{k'}),$

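and states that for admissible configurations the cluster basis functions $\Phi_\alpha$ form a complete basis. A property can therefore be expanded formally as

$E=\sum_\alpha J_\alpha \Phi_\alpha(\boldsymbol{\sigma}),$

with expansion coefficients $J_\alpha$.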

This is the GRACE analogue of ACE completeness, but now for admissible graphs rather than only atom-centered stars (Bochkarev et al., 2023).

The graph interpretation is explicit. Nodes correspond to atoms, a directed edge $i_k \to k$ corresponds to the basis factor $\phi_{i_k u_k}(\mathbf r_k)$, and on-site labels correspond to the factors $\chi_{k\kappa_k}(\mu_k)$. The admissible graphs satisfy several properties: disconnected graph fragments are excluded, edges longer than the cutoff contribute zero, and each node has at most one incoming edge. From these properties, the paper concludes that admissible graphs are trees: connected, acyclic graphs with $n-1$ edges for $n$ nodes (Bochkarev et al., 2023).
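The admissibility conditions listed above can be checked mechanically. The following is a minimal sketch and not the paper's algorithm: the function name `is_admissible_tree`, the edge-list input format, and the explicit edge-count test are assumptions made for illustration.

```python
import math

def is_admissible_tree(num_nodes, edges, positions, cutoff):
    """Check the admissibility conditions on a directed graph.

    edges: list of (source, target) node-index pairs (hypothetical format).
    positions: list of (x, y, z) tuples, one per node.
    """
    incoming = [0] * num_nodes
    neighbors = [[] for _ in range(num_nodes)]
    for src, dst in edges:
        # Edges longer than the cutoff contribute zero, so reject them.
        if math.dist(positions[src], positions[dst]) > cutoff:
            return False
        incoming[dst] += 1
        neighbors[src].append(dst)
        neighbors[dst].append(src)
    # Each node may have at most one incoming edge.
    if any(count > 1 for count in incoming):
        return False
    # Connectivity check by depth-first search on the undirected skeleton.
    seen = {0}
    stack = [0]
    while stack:
        node = stack.pop()
        for other in neighbors[node]:
            if other not in seen:
                seen.add(other)
                stack.append(other)
    if len(seen) != num_nodes:
        return False
    # A connected graph with n nodes is a tree exactly when it has n - 1 edges.
    return len(edges) == num_nodes - 1
```

For example, with four atoms and the edges `[(0, 1), (1, 2), (1, 3)]` the check passes as long as every edge is within the cutoff, whereas dropping the last edge leaves node 3 disconnected and the graph is rejected.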

This graph enlargement strictly generalizes local ACE. Local ACE is recovered as the subset of GRACE containing only star graphs. The two coincide up to three-body interactions in the usual setting and differ from four-body order onward. The 2023 paper emphasizes that GRACE therefore adds genuinely new tree topologies, in which edges also attach to atoms other than the root, beyond the stars. The significance assigned to this extension is geometric sensitivity: in a star graph, all geometry is funneled through one root atom, whereas in a tree the relative arrangement can be distributed across intermediate nodes and can therefore resolve semilocal structure with less angular compression (Bochkarev et al., 2023).
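The statement that stars and trees first differ at four-body order has a simple graph-theoretic counterpart, sketched below. This is an illustration on labeled, undirected trees enumerated through Prüfer sequences, not the basis-function counting used in the paper; the function names are hypothetical.

```python
from itertools import product

def prufer_to_edges(sequence, num_nodes):
    """Decode a Pruefer sequence into the edge list of a labeled tree."""
    degree = [1] * num_nodes
    for node in sequence:
        degree[node] += 1
    edges = []
    for node in sequence:
        # Attach the smallest remaining leaf to the next sequence entry.
        leaf = min(i for i in range(num_nodes) if degree[i] == 1)
        edges.append((leaf, node))
        degree[leaf] -= 1
        degree[node] -= 1
    # Two nodes of degree 1 remain; they form the last edge.
    last = [i for i in range(num_nodes) if degree[i] == 1]
    edges.append((last[0], last[1]))
    return edges

def is_star(edges, num_nodes):
    """A tree is a star if one node touches every edge."""
    degree = [0] * num_nodes
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return max(degree) == num_nodes - 1

def count_trees(num_nodes):
    """Return (number of labeled trees, number of those that are stars)."""
    total = stars = 0
    for sequence in product(range(num_nodes), repeat=num_nodes - 2):
        edges = prufer_to_edges(sequence, num_nodes)
        total += 1
        stars += is_star(edges, num_nodes)
    return total, stars

for n in (2, 3, 4, 5):
    print(n, count_trees(n))
```

With two or three nodes every tree is a star, while with four nodes only 4 of the 16 labeled trees are stars and the remaining 12 are chains, which is where tree-shaped clusters start to add topologies that a star expansion does not contain.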

The symmetry treatment follows ACE. Translation invariance is automatic because only relative vectors appear. Rotation and inversion are handled using spherical harmonics and generalized Clebsch–Gordan couplings. The 2023 formulation writes the symmetry-adapted coefficients in the same generalized coupling machinery used in ACE and equivariant message-passing models, thereby preserving translations, rotations, inversion, and permutation of identical atoms (Bochkarev et al., 2023).
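The invariances described above can be illustrated with the simplest non-trivial case. The sketch below assumes only the $l=1$ channel, for which the real spherical harmonics are proportional to the components of the unit vector and the coupling to a scalar reduces to a dot product. The radial form (a Chebyshev polynomial times a cosine envelope), the function names, and all numerical values are assumptions for illustration and are not taken from the papers.

```python
import math

def radial(n, r, cutoff):
    """Chebyshev polynomial T_n on [0, cutoff] times a smooth envelope (assumed form)."""
    if r >= cutoff:
        return 0.0
    x = 2.0 * r / cutoff - 1.0
    envelope = 0.5 * (1.0 + math.cos(math.pi * r / cutoff))
    return math.cos(n * math.acos(x)) * envelope

def atomic_base(neighbors, n, cutoff):
    """l = 1 channel: sum over neighbors of R_n(r) times the unit vector."""
    total = [0.0, 0.0, 0.0]
    for vec in neighbors:
        r = math.sqrt(sum(c * c for c in vec))
        weight = radial(n, r, cutoff)
        for axis in range(3):
            total[axis] += weight * vec[axis] / r
    return total

def invariant(neighbors, n1, n2, cutoff):
    """Contract two l = 1 bases into a scalar (dot product)."""
    a = atomic_base(neighbors, n1, cutoff)
    b = atomic_base(neighbors, n2, cutoff)
    return sum(x * y for x, y in zip(a, b))

def rotate(vec, angle_z, angle_x):
    """Rotate a vector about the z axis, then about the x axis."""
    x, y, z = vec
    cz, sz = math.cos(angle_z), math.sin(angle_z)
    x, y = cz * x - sz * y, sz * x + cz * y
    cx, sx = math.cos(angle_x), math.sin(angle_x)
    y, z = cx * y - sx * z, sx * y + cx * z
    return (x, y, z)

# Relative vectors from the central atom to its neighbors (made-up values).
neighbors = [(1.0, 0.2, -0.3), (-0.5, 1.1, 0.4), (0.3, -0.8, 1.5)]
rotated = [rotate(v, 0.7, -1.3) for v in neighbors]
print(invariant(neighbors, 1, 2, 4.0), invariant(rotated, 1, 2, 4.0))
```

Because only relative vectors enter, translating all atoms leaves the input unchanged; rotating or inverting the neighbor vectors changes the individual $l=1$ components but not the contracted scalar, and the neighbor sum makes the result independent of the order of identical neighbors.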

3. Recursive evaluation and relation to message passing

A defining feature of GRACE is that the graph expansion is not only broader than ACE but also recursively evaluable. The 2023 paper states that the full graph expansion is very large and simplifies it through tensor decomposition of the expansion coefficients, in which a high-order coefficient tensor is written as a sum of products of lower-order factors, either index by index or more generally with coupled blocks of indices. In graph ACE, the coefficients are decomposed by star substructures of trees, which turns the graph expansion into a recursion (Bochkarev et al., 2023).
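As an illustration of what such a factorization looks like, a generic canonical (rank-$P$) decomposition of a coefficient tensor with $K$ indices can be written as follows. This is a textbook form given for orientation, not necessarily the factorization used in the paper; the symbols $c$, $\lambda_p$, $w^{(k)}$, and the rank $P$ are notation assumed here.

$c_{v_1 v_2 \dots v_K} \approx \sum_{p=1}^{P} \lambda_p\, w^{(1)}_{p v_1}\, w^{(2)}_{p v_2} \cdots w^{(K)}_{p v_K}$

If each index takes $M$ values, the full tensor has $M^K$ entries, whereas the factorized form has $P(KM+1)$ parameters, which is the kind of reduction that makes a large graph expansion tractable.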

The paper introduces a “dandelion” tree, described as a regular tree from which smaller trees are obtained by repeatedly removing florets. This organization yields a layered evaluation procedure. At the deepest layer, one defines an ACE-like atomic base from the basis functions of the neighboring atoms, then applies a local ACE-like polynomial to produce layer messages, and repeats these two steps layer by layer, ending with the root layer.

The interpretation given is direct: compute neighbor-dependent atomic bases, apply a local ACE polynomial on each layer, pass the resulting messages to the next layer, and repeat until the root layer is reached (Bochkarev et al., 2023).
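The layer-by-layer procedure described above can be sketched with scalar features. This is a deliberately simplified toy, not the GRACE implementation: it drops the angular and chemical indices, uses a made-up cosine edge function, and all names (`radial_weight`, `local_polynomial`, `recursive_energies`) and coefficient values are hypothetical.

```python
import math

def radial_weight(distance, cutoff):
    """Smooth scalar edge function standing in for the radial-angular basis."""
    if distance >= cutoff:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * distance / cutoff))

def local_polynomial(base, coefficients):
    """ACE-like local polynomial: sum_k c_k * base**(k + 1)."""
    return sum(c * base ** (k + 1) for k, c in enumerate(coefficients))

def recursive_energies(positions, cutoff, layer_coefficients):
    """Evaluate layers from the deepest to the root and return per-atom values.

    layer_coefficients[0] belongs to the root layer, the last entry to the deepest.
    """
    n = len(positions)
    # Build the neighbor list once; every layer reuses it.
    neighbor_list = []
    for i in range(n):
        row = []
        for j in range(n):
            if i != j:
                w = radial_weight(math.dist(positions[i], positions[j]), cutoff)
                if w > 0.0:
                    row.append((j, w))
        neighbor_list.append(row)
    messages = [1.0] * n  # deepest layer: neighbors carry no incoming message yet
    for coefficients in reversed(layer_coefficients):
        # Atomic base: neighbor sum of edge function times the neighbor's message.
        bases = [sum(w * messages[j] for j, w in row) for row in neighbor_list]
        # Local polynomial turns each base into the message for the next layer.
        messages = [local_polynomial(b, coefficients) for b in bases]
    return messages

chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(recursive_energies(chain, 1.5, [[1.0, 0.5]]))
print(recursive_energies(chain, 1.5, [[1.0, 0.5], [0.3, 0.2]]))
```

With a single layer the value on atom 0 depends only on atoms inside its cutoff, as in a star expansion. With two layers it also responds to atom 2, which lies outside the cutoff of atom 0 but inside that of atom 1. Each layer performs one pass over the neighbor list, so the work per atom in this toy grows with the number of neighbors times the number of layers.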

This recursive form is the basis for the paper’s most prominent conceptual claim: message-passing models can be understood as simplified recursive evaluations of graph ACE. The 2023 paper places NequIP, ml-ACE, Multi-ACE or MACE, and Allegro within the GRACE design space. In that account, these methods are not merely message-passing architectures in an abstract machine-learning sense; they are recursive evaluations of graph ACE under specific tensor factorization choices (Bochkarev et al., 2023).

The scaling result is similarly explicit. Although a naive evaluation of a $T$-layer graph ACE could appear to scale like $N_c^{T}$ with the neighbor count $N_c$, the 2023 paper derives explicit algorithms showing that both energy and forces can be evaluated with cost linear in neighbor count and linear in layer depth,

$\mathcal{O}(T\,N_c).$

The appendix gives an explicit force recursion and then reorganizes the sums into layer-wise prefactors so that total force cost remains linear in both $T$ and $N_c$ (Bochkarev et al., 2023).

A plausible implication is that GRACE’s relation to message passing is strongest at the level of evaluation strategy rather than only at the level of representational analogy. The papers consistently frame GRACE as deriving iterative equivariant aggregation from a basis-expansion formalism rather than postulating it as a black-box network design (Bochkarev et al., 2023, Lysogorskiy et al., 25 Aug 2025).

4. Foundational MLIP architecture and training protocol

The 2025 GRACE paper instantiates the framework as a family of universal interatomic potentials implemented in the grace-tensorpotential package. The total energy is decomposed into atomic contributions, and each atomic contribution is built from invariant descriptors derived from recursively constructed equivariant basis functions. According to the supplementary description, energy is obtained by first constructing basis functions up to fourth product order using sparse coupling operations, then combining them into messages in the multi-layer version, and finally feeding the invariant densities into a readout network (Lysogorskiy et al., 25 Aug 2025).

The training objective is a weighted sum of three terms,

$\mathcal{L}=w_E\,\mathcal{L}_E+w_F\,\mathcal{L}_F+w_S\,\mathcal{L}_S,$

where $\mathcal{L}_E$, $\mathcal{L}_F$, and $\mathcal{L}_S$ are energy-per-atom, force-component, and stress-component losses and $w_E$, $w_F$, and $w_S$ are the corresponding weights. The paper uses a Huber loss with the same threshold parameter $\delta$ for all terms. Forces are obtained as exact gradients of the total energy with respect to atomic positions, so the model is conservative by construction (Lysogorskiy et al., 25 Aug 2025).
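A weighted Huber objective over energy-per-atom, force-component, and stress-component residuals can be sketched as below. The dictionary layout, the function names, and every numerical value (threshold and weights) are assumptions for illustration and do not reproduce the settings of the paper.

```python
def huber(residual, delta):
    """Quadratic for small residuals, linear beyond the threshold delta."""
    magnitude = abs(residual)
    if magnitude <= delta:
        return 0.5 * residual * residual
    return delta * (magnitude - 0.5 * delta)

def mean_huber(predicted, reference, delta):
    """Average Huber loss over paired scalar values."""
    return sum(huber(p - r, delta) for p, r in zip(predicted, reference)) / len(reference)

def total_loss(pred, ref, weights, delta):
    """Weighted sum of energy-per-atom, force-component, and stress-component terms."""
    return (
        weights["energy"] * mean_huber(pred["energy_per_atom"], ref["energy_per_atom"], delta)
        + weights["forces"] * mean_huber(pred["forces"], ref["forces"], delta)
        + weights["stress"] * mean_huber(pred["stress"], ref["stress"], delta)
    )

# Made-up numbers: one structure, flattened force and stress components.
reference = {"energy_per_atom": [-3.20], "forces": [0.10, -0.20, 0.00], "stress": [0.01, 0.02]}
predicted = {"energy_per_atom": [-3.18], "forces": [0.12, -0.25, 0.30], "stress": [0.01, 0.03]}
weights = {"energy": 1.0, "forces": 1.0, "stress": 1.0}  # hypothetical weights
print(total_loss(predicted, reference, weights, delta=0.05))
```

The Huber form penalizes small residuals quadratically and large ones only linearly, which limits the influence of outlier structures compared with a pure squared error.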

The architecture choices reported in the paper include a fixed cutoff radius, Chebyshev radial basis functions, spherical harmonics up to a fixed maximum angular momentum, a learned low-dimensional chemical embedding shared by all supported elements, basis functions recursively built up to fourth product order, and, in the two-layer models, equivariant functions passed between layers as messages (Lysogorskiy et al., 25 Aug 2025).

The model families are organized as one-layer and two-layer architectures, each in small, medium, and large variants. The one-layer models are described as “ACE star-graphs with direct interactions,” while the two-layer models add semilocal interactions through equivariant message passing. The supplementary table reports invariant-function and parameter counts in the millions for the one-layer models, and parameter counts for the two-layer models that are likewise in the millions and depend on model size. For the two-layer models, the first layer contains thousands of equivariant functions and the second layer several thousand invariant functions (Lysogorskiy et al., 25 Aug 2025).

Training is staged and dataset-driven. The primary dataset is OMat24, described as the largest publicly available materials property dataset used in the paper, with more than 100 million DFT calculations. It is mostly computed with VASP using GGA-PBE and includes Hubbard $U$ corrections for some oxides and fluorides. The paper states that OMat24 is broader than Alexandria and MPTraj because it includes many non-equilibrium structures generated via Boltzmann sampling, AIMD, and relaxations (Lysogorskiy et al., 25 Aug 2025).

The staged optimization protocol is summarized below.

Stage	Data	Loss weighting
OMAT-base	OMat24	stage-specific energy, force, and stress weights
OAM fine-tuning	MPTraj + sAlex	stage-specific energy, force, and stress weights
ft-E fine-tuning	OMat24	heavier energy emphasis

Optimization uses Adam with cosine learning-rate decay during base training and a constant learning rate during fine-tuning. Training is done on H100 GPUs, with batches organized by number of bonds rather than structures to improve throughput. The reported full training cost, measured in GPU-hours, depends on model size (Lysogorskiy et al., 25 Aug 2025).
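Two of the ingredients mentioned above, cosine learning-rate decay and batching by bond count, can be sketched in a few lines. The function names, the greedy packing strategy, and all numerical values are assumptions for illustration; the paper's actual schedule parameters and batching code are not reproduced here.

```python
import math

def cosine_learning_rate(step, total_steps, start_rate, end_rate):
    """Cosine decay from start_rate at step 0 to end_rate at total_steps."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return end_rate + 0.5 * (start_rate - end_rate) * (1.0 + math.cos(math.pi * progress))

def batch_by_bonds(bond_counts, bond_budget):
    """Greedily group structures so each batch holds at most bond_budget bonds.

    A structure larger than the budget is placed in a batch of its own.
    """
    batches, current, used = [], [], 0
    for index, bonds in enumerate(bond_counts):
        if current and used + bonds > bond_budget:
            batches.append(current)
            current, used = [], 0
        current.append(index)
        used += bonds
    if current:
        batches.append(current)
    return batches

# Hypothetical values: five structures with different bond counts.
print(batch_by_bonds([400, 500, 300, 1200, 100], bond_budget=1000))
print([cosine_learning_rate(s, 100, 1e-2, 1e-4) for s in (0, 50, 100)])
```

Packing by bonds rather than by structure count keeps the work per batch roughly constant when structures differ strongly in size, which is the throughput argument made in the text.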

5. Benchmarks, efficiency, and empirical scope

The 2025 paper evaluates GRACE on a deliberately broad benchmark suite intended to test the “foundational” character of the potentials: MatBench Discovery for formation energy and stable-structure identification, thermal conductivity via the symmetric relative mean error (SRME), elastic constants, grain boundary energies, surface energies, point defects, and long-time molecular-dynamics stability on molten FLiBe (Lysogorskiy et al., 25 Aug 2025).

On MatBench Discovery, geometry optimization is performed over the WBM candidate structure set and stable-structure identification is measured using F1 score. The paper states that the GRACE models, especially the OAM-fine-tuned versions, lie on or near the Pareto front of accuracy versus runtime. A specific example reported in the table is GRACE-2L-OAM-L, for which the paper lists the F1 score together with per-atom runtimes in ASE and in LAMMPS. The smaller GRACE variants trade some accuracy for speed: GRACE-1L-OMAT is reported with per-atom runtimes in both ASE and LAMMPS, while the 2L models are slower but more accurate (Lysogorskiy et al., 25 Aug 2025).
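For reference, an F1 score for stable-structure identification can be computed as sketched below. The function name and the input format are hypothetical, and the stability criterion (energy above the convex hull at or below a threshold, defaulting to zero) is an assumption stated here rather than a detail taken from the text; the numbers are made up.

```python
def f1_score(predicted_above_hull, reference_above_hull, threshold=0.0):
    """F1 for classifying structures as stable (energy above hull <= threshold)."""
    tp = fp = fn = 0
    for predicted, reference in zip(predicted_above_hull, reference_above_hull):
        predicted_stable = predicted <= threshold
        reference_stable = reference <= threshold
        if predicted_stable and reference_stable:
            tp += 1  # correctly identified stable structure
        elif predicted_stable:
            fp += 1  # predicted stable, but the reference says unstable
        elif reference_stable:
            fn += 1  # missed stable structure
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)

# Energies above hull in eV/atom for five candidate structures (made-up values).
predicted = [-0.10, -0.05, 0.20, 0.30, -0.20]
reference = [-0.20, 0.10, -0.10, 0.40, -0.30]
print(f1_score(predicted, reference))
```

The score combines precision and recall, so a model cannot do well by labeling everything stable or by labeling almost nothing stable.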

Thermal conductivity is evaluated using the symmetric relative mean error (SRME) on a set of binary structures, with conductivities computed from predicted forces and analyzed using phono3py. The one- and two-layer GRACE models again form the Pareto front, and the best reported result is obtained by GRACE-2L-OAM-L. The paper interprets this as significant because thermal conductivity depends on accurate higher-order derivatives and anharmonicity, so the benchmark probes more than static energy prediction (Lysogorskiy et al., 25 Aug 2025).
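The idea behind a symmetric relative error can be sketched as follows. This is a simplified, structure-level variant given as an assumption for illustration: the benchmark metric used in the paper may be defined differently (for example, resolved by phonon mode), and the function names and numbers here are hypothetical.

```python
def symmetric_relative_error(predicted, reference):
    """2|p - r| / (p + r) for positive quantities; symmetric in p and r."""
    return 2.0 * abs(predicted - reference) / (predicted + reference)

def mean_symmetric_relative_error(predicted_values, reference_values):
    """Average the symmetric relative error over a set of structures."""
    errors = [
        symmetric_relative_error(p, r)
        for p, r in zip(predicted_values, reference_values)
    ]
    return sum(errors) / len(errors)

# Thermal conductivities in W/(m K) for three structures (made-up values).
predicted = [10.0, 2.5, 40.0]
reference = [12.0, 2.5, 30.0]
print(mean_symmetric_relative_error(predicted, reference))
```

Dividing by the sum of prediction and reference bounds each term between 0 and 2 and treats over- and underestimation symmetrically, which is useful when the values span orders of magnitude.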

For specialized structure classes, the paper argues that GRACE generalizes beyond near-equilibrium bulk crystals. GRACE-2L-OAM-L achieves the lowest symmetric relative mean error overall among tested models for elastic constants. Grain boundaries and surfaces typically show moderate relative errors, but the paper notes systematic weaknesses for alkali metals such as K, Rb, and Cs, likely because the fixed cutoff is too short for such large atoms. Point-defect formation energies for self-interstitials and vacancies show generally reasonable SRME values (Lysogorskiy et al., 25 Aug 2025).

Long-time molecular-dynamics stability is demonstrated with a nanosecond-scale simulation of molten FLiBe using GRACE-2L-OMAT-L. The paper reports radial distribution functions and diffusion coefficients close to AIMD reference values, and treats this as evidence that the potentials remain stable over long trajectories rather than only under one-step evaluation metrics (Lysogorskiy et al., 25 Aug 2025).

The earlier 2023 GRACE paper provides a complementary empirical profile. On revised MD17, GRACE achieved the best reported performance for all molecules against the compared models, and it remained best when trained on a reduced number of structures per molecule. On the 3BPA extrapolation dataset, it outperformed the competing models on the 300 K, 600 K, and 1200 K test sets and on the dihedral-slice tests. On the general-purpose carbon dataset, GRACE achieved slightly lower total errors overall than MACE while using a parameter count reported in the thousands, compared with millions for MACE; the paper also reports per-atom evaluation times for both models on an NVIDIA A100 GPU (Bochkarev et al., 2023).

Taken together, these results support two different but connected readings of GRACE. The 2023 paper emphasizes semilocal representation efficiency and parameter efficiency in smaller-scale or chemically focused settings, whereas the 2025 paper emphasizes universal coverage, Pareto-front accuracy versus efficiency, and broad transfer across materials tasks (Bochkarev et al., 2023, Lysogorskiy et al., 25 Aug 2025).

6. Adaptation, distillation, limitations, and adjacent developments

A major practical claim of the 2025 work is that GRACE functions as a foundation model rather than only as a fixed universal potential. In an Al-Li case study, the authors fine-tune one-layer and two-layer GRACE models on a curated dataset and compare against training-from-scratch baselines. The fine-tuned 2L model outperforms the training-from-scratch baselines, especially in low-data regimes, and the paper stresses that zero-shot GRACE predictions are already strong before specialization (Lysogorskiy et al., 25 Aug 2025).

The hydrogen-combustion experiment highlights a less favorable regime. There, the target fine-tuning data use very different electronic-structure settings from the pretraining data: Q-Chem with $\omega$B97X-V/cc-pVTZ rather than the PBE-based training sets. The paper reports that naive fine-tuning improves the new H2COMB task substantially but causes catastrophic forgetting on the original sAlex data, especially for H and O atoms. To reduce forgetting, the authors freeze parts of the model and train only selected ACE expansion coefficients. Two frozen-weight strategies are described: one trains only the final ACE expansion coefficients before atomic-energy readout, and another also trains the coefficients of messages passed between the first and second layers. Because these coefficients are element-dependent, parameters for unseen elements remain untouched. The resulting old-task and new-task errors trace out a Pareto front, illustrating the trade-off between adaptation and retention. The paper names LoRA and delta-tuning as promising future approaches (Lysogorskiy et al., 25 Aug 2025).
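The frozen-weight idea can be sketched as a masked gradient step. This is a schematic illustration, not the paper's training code: the parameter layout, the group names, and the function `fine_tune_step` are hypothetical, and the explicit element mask stands in for the fact that element-dependent coefficients of unseen elements receive no gradient.

```python
def fine_tune_step(parameters, gradients, trainable_groups, elements_in_data, rate):
    """One gradient step that updates only selected groups and seen elements.

    parameters / gradients: {group_name: {element: [floats]}} (hypothetical layout).
    """
    updated = {}
    for group, per_element in parameters.items():
        updated[group] = {}
        for element, values in per_element.items():
            trainable = group in trainable_groups and element in elements_in_data
            if trainable:
                grads = gradients[group][element]
                updated[group][element] = [v - rate * g for v, g in zip(values, grads)]
            else:
                # Frozen group or unseen element: values are copied unchanged.
                updated[group][element] = list(values)
    return updated

parameters = {
    "readout_coefficients": {"H": [1.0, 2.0], "O": [0.5], "Fe": [3.0]},
    "message_coefficients": {"H": [0.1], "O": [0.2], "Fe": [0.3]},
    "radial_basis": {"H": [0.7], "O": [0.8], "Fe": [0.9]},
}
gradients = {
    group: {element: [1.0] * len(values) for element, values in per_element.items()}
    for group, per_element in parameters.items()
}
# First strategy in the text: train only the final expansion coefficients.
step = fine_tune_step(parameters, gradients, {"readout_coefficients"}, {"H", "O"}, 0.1)
print(step["readout_coefficients"], step["radial_basis"])
```

Adding `"message_coefficients"` to the trainable set corresponds to the second strategy, which gives the model more freedom to adapt at the cost of a higher risk of forgetting.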

The distillation experiment addresses a different deployment problem: compressing broad chemical knowledge into a simpler and faster model. The authors fine-tune GRACE-2L-OMAT on HEA25 and HEA25S, use it to label a synthetic dataset composed of HEA structures plus unary and binary structures for all elements covered, and then train a simpler GRACE-FS student on the synthetic labels. Although the distilled student is less accurate than the teacher on the HEA25S validation set, it remains competitive and outperforms a baseline student trained directly on the original DFT-labeled HEA25+HEA25S data for unary and binary formation energies (Lysogorskiy et al., 25 Aug 2025).
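The teacher-student workflow can be illustrated with a one-dimensional toy. Everything here is an assumption for illustration: the "teacher" is an arbitrary function standing in for a fine-tuned model, the "student" is a straight-line fit, and the names `fit_line` and `distill` are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def distill(teacher, synthetic_inputs):
    """Label synthetic inputs with the teacher, then fit the student to those labels."""
    labels = [teacher(x) for x in synthetic_inputs]
    return fit_line(synthetic_inputs, labels)

def teacher(x):
    """Stand-in for an expensive, accurate model (made-up functional form)."""
    return 2.0 * x + 0.1 * math.sin(5.0 * x)

# Synthetic inputs need no reference labels; the teacher provides them.
synthetic_inputs = [0.05 * k for k in range(41)]
slope, intercept = distill(teacher, synthetic_inputs)
print(slope, intercept)
```

The student never sees reference data directly. Its accuracy is bounded by the teacher, but the synthetic set can be made as dense and as broad as needed, which is the mechanism by which the distilled model in the text can outperform a student trained only on the original labeled data.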

Several limitations are stated directly. Some errors remain elevated for large alkali elements, suggesting that the fixed cutoff can be insufficient. Validation of universal potentials is still constrained by the lack of exhaustive benchmarks spanning all possible element combinations and simulation types. Fine-tuning on substantially different electronic-structure data can trigger catastrophic forgetting under naive parameter updates. The paper also notes that further details on the GRACE basis and architecture will be published separately (Lysogorskiy et al., 25 Aug 2025).

Within the broader ACE ecosystem, several adjacent works clarify GRACE’s place without being GRACE itself. “Atomic Cluster Expansion without Self-Interaction” analyzes canonical non-self-interacting cluster expansions and derives a sparse purification transform from self-interacting ACE features, presenting a theoretical precursor for graph-pure cluster constructions (Ho et al., 2024). “Charge-constrained Atomic Cluster Expansion” explicitly situates ACE as local or semilocal for graph ACE and extends the formalism to variational charge and descriptor degrees of freedom with long-range electrostatics (Rinaldi et al., 2024). “Atomic cluster expansion and wave function representations” generalizes ACE to anti-symmetric functions and shows how determinant, backflow, and related wave-function ansätze fit within an ACE design space; this suggests a route by which graph-based ACE ideas could be extended to fermionic electronic-structure representations (Drautz et al., 2022). These works do not define GRACE, but they delineate a surrounding theory space in which graph-based ACE serves as a unifying principle.

The overall scientific contribution attributed to GRACE in the 2025 paper is the unification of three elements often treated separately: a formally complete many-body basis, graph or message-passing evaluation, and universal multi-element coverage. This suggests a pretrained paradigm for atomistic simulation in which broadly trained GRACE potentials are adapted, specialized, or compressed rather than rebuilt from scratch for each chemistry or task (Lysogorskiy et al., 25 Aug 2025).