Tensor Cluster Expansion (TCE)

Updated 8 September 2025

Tensor Cluster Expansion (TCE) is a formalism that uses explicit tensor contractions to model interactions and correlations in complex systems.
It enables efficient and scalable energy calculations in multicomponent materials by leveraging precomputed topology tensors and GPU-accelerated sparse-dense operations.
TCE overcomes traditional cluster expansion bottlenecks with rapid local energy updates and adaptable, parallelizable computation for large-scale simulations.

Tensor Cluster Expansion (TCE) is a computational and theoretical formalism that expresses interaction energies, correlation functions, and other observables of multicomponent statistical and quantum systems as explicit tensor contractions. Leveraging precomputed topology tensors and configuration encodings, TCE generalizes and subsumes standard cluster expansion (CE) techniques, enabling direct, scalable evaluation of energies and correlation functions, particularly in solids and many-body models. The central unifying feature is the mapping of lattice or graph-topological data into mixed dense-sparse tensor operations, making the formalism inherently suited for high-throughput, parallel computing environments and eliminating the enumerative bottlenecks of traditional CE approaches.

1. Mathematical Foundations and Formulation

TCE reformulates the effective Hamiltonian and correlation functions as series of tensor contractions over configuration and topology tensors, removing explicit iteration over cluster types. Each lattice site $i$ and chemical species $\alpha$ is mapped to a configuration tensor $X_{i\alpha}$, where $X_{i\alpha} = 1$ if site $i$ is occupied by species $\alpha$ and 0 otherwise. Lattice topology is encoded via tensors such as:

$A_{ij}^{(n)}$ for two-body ($n$-th neighbor pair) interactions
$B_{ijk}^{(n)}$ for three-body ($n$-th order triplet) interactions

Correlation functions for two- and three-body clusters are defined by tensor contractions:

$N_{\alpha\beta}^{(n)} = \sum_{ij} A_{ij}^{(n)} X_{i\alpha} X_{j\beta}$

$M_{\alpha\beta\gamma}^{(n)} = \sum_{ijk} B_{ijk}^{(n)} X_{i\alpha} X_{j\beta} X_{k\gamma}$

The effective Hamiltonian is expressed as:

$\mathcal{H}_{\text{eff}}(X) = \frac{1}{2!} \sum_{n,\alpha\beta} \epsilon_{\alpha\beta}^{(n)} N_{\alpha\beta}^{(n)} + \frac{1}{3!} \sum_{n,\alpha\beta\gamma} \zeta_{\alpha\beta\gamma}^{(n)} M_{\alpha\beta\gamma}^{(n)} + \ldots$

Here, $\epsilon_{\alpha\beta}^{(n)}$ and $\zeta_{\alpha\beta\gamma}^{(n)}$ are learnable cluster interaction parameters. TCE generalizes readily to higher-order clusters and tensorial observables via further contraction over corresponding topology tensors.
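
The contractions above can be sketched with plain NumPy on a toy lattice. Everything specific here is an assumption made for illustration: the periodic 1D ring, the helper names `ring_pair_topology` and `chain_triplet_topology`, the choice of "chain" triplets for $B$, and the parameter values in `eps` and `zeta`. It is not the tce-lib implementation.

```python
import numpy as np

def ring_pair_topology(n_sites, shell=1):
    """A[i, j] = 1 when j is `shell` steps from i on a periodic 1D ring (toy lattice)."""
    A = np.zeros((n_sites, n_sites))
    idx = np.arange(n_sites)
    A[idx, (idx + shell) % n_sites] = 1.0
    A[idx, (idx - shell) % n_sites] = 1.0
    return A

def chain_triplet_topology(A):
    """B[i, j, k] = 1 when i-j and j-k are neighbor pairs and i != k (illustrative choice)."""
    B = np.einsum("ij,jk->ijk", A, A)
    idx = np.arange(A.shape[0])
    B[idx, :, idx] = 0.0  # drop degenerate triplets that return to site i
    return B

def effective_energy(N, M, eps, zeta):
    """H_eff = (1/2!) sum eps*N + (1/3!) sum zeta*M for a single neighbor shell n."""
    return np.sum(eps * N) / 2.0 + np.sum(zeta * M) / 6.0

species = np.array([0, 1, 0, 1, 1, 0])   # species index per site
X = np.eye(2)[species]                   # one-hot configuration tensor X[i, alpha]

A = ring_pair_topology(len(species))
B = chain_triplet_topology(A)

# Pair and triplet correlation tensors as explicit contractions.
N = np.einsum("ij,ia,jb->ab", A, X, X)
M = np.einsum("ijk,ia,jb,kc->abc", B, X, X, X)

# Hypothetical interaction parameters, chosen only to exercise the formula.
eps = np.array([[0.0, -0.1], [-0.1, 0.0]])
zeta = np.full((2, 2, 2), 0.01)

energy = effective_energy(N, M, eps, zeta)
print(N)
print(energy)
```

Note that `N` counts ordered pairs, so each bond contributes twice; the $1/2!$ prefactor in the Hamiltonian compensates for that double counting.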

2. Implementation in Computational Practice

In tce-lib (Jeffries et al., 4 Sep 2025), TCE is instantiated by precomputing all required topology tensors per lattice and encoding the configuration as a one-hot tensor $X$. The feature vector $t(X)$ for a given configuration is assembled via sparse–dense tensor contractions, using libraries such as sparse and opt-einsum for efficiency and GPU acceleration. Training fits the interaction parameter vector $j$ in the linear mapping $\mathcal{H}_{\text{eff}}(X) = j \cdot t(X)$ to reference data (e.g., DFT energies).
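
As a rough illustration of the linear fit $\mathcal{H}_{\text{eff}}(X) = j \cdot t(X)$, the sketch below builds pair-only feature vectors on a toy ring lattice and fits $j$ by ordinary least squares. It uses dense NumPy rather than the sparse and opt-einsum libraries named above, and the function `features`, the synthetic "reference" energies, and the parameter vector `j_true` are all hypothetical stand-ins, not tce-lib's API or real DFT data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sites, n_species = 8, 2

# Toy nearest-neighbor topology tensor on a periodic 1D ring (assumption).
idx = np.arange(n_sites)
A = np.zeros((n_sites, n_sites))
A[idx, (idx + 1) % n_sites] = 1.0
A[idx, (idx - 1) % n_sites] = 1.0

def features(species):
    """Feature vector t(X): the flattened pair correlation tensor N."""
    X = np.eye(n_species)[species]
    N = np.einsum("ij,ia,jb->ab", A, X, X)
    return N.ravel()

# Synthetic training set: random configurations with energies generated from
# a made-up parameter vector, standing in for reference data such as DFT.
configs = rng.integers(0, n_species, size=(40, n_sites))
T = np.stack([features(c) for c in configs])
j_true = np.array([0.0, -0.5, -0.5, 0.2])
energies = T @ j_true

# Least-squares fit of the interaction parameter vector j.
j_fit, *_ = np.linalg.lstsq(T, energies, rcond=None)
predicted = T @ j_fit
print(np.max(np.abs(predicted - energies)))
```

Because the entries of $N$ are linearly dependent (for example $N_{01} = N_{10}$), the fitted vector need not equal `j_true` even though the predicted energies match; least squares returns the minimum-norm solution in that case.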

For energy differences between two configurations $X$ and $X'$, TCE leverages the incremental structure:

$\mathcal{H}_{\text{eff}}(X') - \mathcal{H}_{\text{eff}}(X) = j \cdot ( t(X') - t(X) )$

Further optimization allows nearly $\mathcal{O}(1)$ time updates, i.e., a cost nearly independent of system size, by restricting the contraction to "changed" sites only. This is crucial for large-scale Monte Carlo simulations and rapid local relaxation.
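
One way to picture the restriction to changed sites is the pair-only sketch below, which evaluates only the bonds touching the swapped sites and compares the result with a full recomputation. It assumes a symmetric topology stored as neighbor lists on a toy ring and a symmetric, made-up `eps`; the function names are hypothetical and this is not how tce-lib necessarily organizes the update.

```python
import numpy as np

n_sites = 8
# Neighbor lists for a periodic 1D ring (toy stand-in for a sparse A tensor).
neighbors = [[(i - 1) % n_sites, (i + 1) % n_sites] for i in range(n_sites)]

# Hypothetical symmetric pair interaction parameters.
eps = np.array([[0.0, -0.1], [-0.1, 0.05]])

def full_pair_energy(neighbors, eps, species):
    """(1/2) * sum over all ordered neighbor pairs; cost grows with system size."""
    return 0.5 * sum(
        eps[species[i], species[j]]
        for i in range(len(species))
        for j in neighbors[i]
    )

def local_pair_energy(neighbors, eps, species, changed):
    """Energy of the bonds touching `changed` sites only.

    A bond between two changed sites is visited from both ends, so it gets
    weight 1/2 per visit; a bond to an unchanged site is visited once.
    """
    changed_set = set(changed)
    total = 0.0
    for i in changed:
        for j in neighbors[i]:
            weight = 0.5 if j in changed_set else 1.0
            total += weight * eps[species[i], species[j]]
    return total

before = np.array([0, 0, 0, 1, 1, 1, 0, 1])
after = before.copy()
after[[2, 3]] = before[[3, 2]]  # swap the species on adjacent sites 2 and 3
changed = [2, 3]

delta_local = (local_pair_energy(neighbors, eps, after, changed)
               - local_pair_energy(neighbors, eps, before, changed))
delta_full = (full_pair_energy(neighbors, eps, after)
              - full_pair_energy(neighbors, eps, before))
print(delta_local, delta_full)
```

The local evaluation touches a number of bonds set by the coordination number and the number of changed sites, not by the total number of sites, which is the sense in which the update cost is nearly constant.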

3. Advantages over Traditional Cluster Expansion

TCE circumvents essential limitations of the standard CE, offering:

Universal applicability: No requirement to enumerate symmetrically equivalent cluster types per lattice, directly accommodating arbitrary, low-symmetry, or exotic lattices.
Scalability: Tensor contractions allow regular, high-throughput computation, making the approach suitable for massive parallelization on GPUs/TPUs.
Local energy updates: The structure allows fast incremental calculations for local configuration changes, essential for efficient sampling and molecular dynamics.
Explicit topology exposure: The physical (lattice) topology is directly encoded in the contraction tensors, offering transparent extension to higher-order interactions and tensorial fields.

4. Applications to Multicomponent Materials and Alloys

TCE has been applied to a variety of materials modeling problems:

Binary alloy TaW: In bcc Ta$_{1-x}$W$_x$, TCE models were fit to DFT data via genetic algorithm sampling, allowing efficient computation of enthalpy of mixing curves:

$\Delta H_{\text{mix}}(x) = E(x) - x E_{\text{W}} - (1-x) E_{\text{Ta}}$

The predicted curves and locations of minimum enthalpy (near 60–70% W) align closely with DFT (Jeffries et al., 4 Sep 2025).
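
The mixing enthalpy formula can be evaluated on a composition grid as in the short sketch below. The energies are placeholders generated from a symmetric toy model with made-up endpoint values and a made-up parameter `omega`; they are not TaW data and do not reproduce the minimum location reported above.

```python
import numpy as np

def mixing_enthalpy(e_alloy, x_w, e_w, e_ta):
    """Delta H_mix(x) = E(x) - x * E_W - (1 - x) * E_Ta, all per atom."""
    return e_alloy - x_w * e_w - (1.0 - x_w) * e_ta

# Placeholder per-atom energies (arbitrary units), NOT computed TaW values.
e_w, e_ta, omega = -2.0, -1.0, -0.4
x = np.linspace(0.0, 1.0, 11)                 # W fraction
e_alloy = x * e_w + (1.0 - x) * e_ta + omega * x * (1.0 - x)

dh_mix = mixing_enthalpy(e_alloy, x, e_w, e_ta)
x_min = x[np.argmin(dh_mix)]                  # composition of minimum enthalpy
print(x_min)
```

In practice `e_alloy` would come from evaluating the fitted TCE model (or DFT) at each composition, with the pure-element energies taken from the same source.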

High-Entropy Alloy CoNiCrFeMn: For the equiatomic fcc system, surrogate models and Monte Carlo/MD data were used for training. TCE yields Cowley SRO parameters that reproduce the chemical segregation trends (e.g., strongly negative Cr–Cr SRO), consistent with ground-truth MEAM-driven MC/MD simulations.

System	Property	TCE Prediction	Ground Truth
TaW (bcc)	$\Delta H_{\text{mix}}$	Accurate min. at ~60–70% W	DFT minimum at same composition
CoNiCrFeMn (fcc)	Cowley SRO	Strongly negative for Cr–Cr	MC/MD + MEAM show same segregation trend
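
Cowley SRO parameters follow directly from the pair correlation tensor $N_{\alpha\beta}^{(n)}$. The sketch below assumes the common Warren-Cowley convention $\alpha_{\alpha\beta} = 1 - p_{\alpha\beta} / c_\beta$, where $p_{\alpha\beta}$ is the fraction of neighbors of species $\alpha$ that are species $\beta$ and $c_\beta$ is the overall concentration. The ring lattice, the two-species alternating configuration, and the function name `cowley_sro` are illustrative assumptions, and every species is assumed to be present so that no division by zero occurs.

```python
import numpy as np

def cowley_sro(A, species, n_species):
    """Warren-Cowley parameters from the pair correlation tensor N = A X X."""
    X = np.eye(n_species)[species]
    N = np.einsum("ij,ia,jb->ab", A, X, X)
    p = N / N.sum(axis=1, keepdims=True)   # p[a, b]: share of a's neighbors that are b
    c = X.mean(axis=0)                     # overall concentration of each species
    return 1.0 - p / c[None, :]

# Toy nearest-neighbor topology on a periodic 1D ring.
n_sites = 6
idx = np.arange(n_sites)
A = np.zeros((n_sites, n_sites))
A[idx, (idx + 1) % n_sites] = 1.0
A[idx, (idx - 1) % n_sites] = 1.0

# Perfectly alternating configuration: every neighbor is the other species.
species = np.array([0, 1, 0, 1, 0, 1])
alpha = cowley_sro(A, species, n_species=2)
print(alpha)
```

Under this convention a negative value means the pair occurs more often than in a random solution, so a negative like-pair value indicates clustering of that species, while the alternating toy configuration gives negative unlike-pair values.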

5. Mathematical Generalizations and Future Directions

The TCE formalism is compatible with generalizations in contemporary cluster expansion theory, including:

Linked-cluster expansions for correlated wavefunctions (Yamada, 2018, Myo et al., 2022)
Axiomatic Hilbert space-based cluster definitions via Möbius inversion (Lammert et al., 2022)
Graph-based tensor decompositions for semilocal and message-passing models (Bochkarev et al., 2023)

This suggests high potential for TCE as an adaptable backbone for machine learning potentials, tensor network-based expansions in solid state and quantum chemistry, and general many-body simulation frameworks. A plausible implication is that TCE approaches will further facilitate extensions to higher-order physical properties (e.g., stress tensors), enable real-time dynamical simulation via rapid energy updates, and integrate naturally with tensor network representations.

6. Comparison to Ground Truth and Validity

The benchmarked applications in multicomponent alloys indicate that TCE models, when trained on reliable reference data, agree closely with the ground-truth methods:

DFT-calculated mixing enthalpies (TaW) and the SRO trends from MEAM-driven MC/MD simulations (CoNiCrFeMn) are reproduced with high fidelity.
Local energy updates via TCE maintain accuracy while dramatically enhancing computational throughput.

This comprehensive match underscores TCE's reliability and robustness as a formal and practical generalization of cluster expansion methods, particularly when implemented on modern parallel hardware platforms (Jeffries et al., 4 Sep 2025).

7. Limitations and Scope

TCE's performance is contingent on the accuracy of the fitted interaction parameters and the completeness of the topology tensors. For systems dominated by long-range or multi-body interactions, careful inclusion of higher-order tensors is required. While $\mathcal{O}(1)$ energy difference calculations are possible for local changes, the computational cost of contracting very large topology tensors may increase for systems with extensive multi-body correlations. The formalism is tied to the representation quality of the configuration encoding and topology tensors, and its generalization to systems with non-trivial tensorial observables (e.g., elasticity, polarization) requires appropriate tensor contraction mappings.

In summary, Tensor Cluster Expansion provides an efficient, adaptable, and directly implementable framework for modeling energies and correlations in multicomponent solids. It subsumes and extends traditional cluster expansion methods, with demonstrated accuracy against high-level reference data and scalable computational performance (Jeffries et al., 4 Sep 2025).