
Most Reliable Independent Basis in ACE

Updated 2 June 2026
  • MRIB is an analytically derived, block-wise independent basis of rotation- and permutation-invariant cluster functions, providing a minimal yet robust foundation for ACE descriptors.
  • It leverages permutation-adapted arrangements and ladder recursions via generalized Wigner symbols to eliminate numerical SVD and enhance stability across high-degree clusters.
  • The block-wise selection algorithm reduces descriptor multicollinearity and computational cost, enabling efficient sparse regression in machine-learning interatomic potential models.

The Most Reliable Independent Basis (MRIB) is an analytically derived, block-wise independent, and conjecturally complete basis of rotation- and permutation-invariant (RPI) cluster functions for the Atomic Cluster Expansion (ACE). Its construction leverages permutation-adapted arrangements of cluster basis functions and recursion properties of generalized Wigner symbols. The MRIB provides a minimal and robust foundation for constructing interatomic potentials that maintain the symmetries and completeness required in modern atomistic modeling, addressing limitations of prior lexicographically ordered or numerically SVD-pruned bases (Goff et al., 2022).

1. Permutation-Adapted Rotation- and Permutation-Invariant Cluster Functions

The ACE framework describes atomic local environments in terms of basis functions that must be symmetrized with respect to both rotations and permutations. The MRIB is defined by systematically constructing RPI cluster functions as follows:

  • Single-bond basis functions:

$$\phi_{nlm}(r_{ij}) = R_n(r_{ij})\, Y_l^m(\hat{r}_{ij})$$

$$A_{nlm}(i) = \langle \rho_i, \phi_{nlm} \rangle = \sum_j \phi_{nlm}(r_{ij})$$

  • Un-symmetrized cluster products and their permutation symmetrization:

$$\Phi_{\vec{n}\vec{l}\vec{m}}(i) = \prod_{\kappa=1}^N A_{n_\kappa l_\kappa m_\kappa}(i)$$

$$\bar\Phi_{\vec{n}\vec{l}\vec{m}} = \frac{1}{\sqrt{N!}} \sum_{\sigma \in S_N} \Phi_{\sigma(\vec{n}\vec{l}\vec{m})}$$

Each block comprises all functions with the same multiset of index pairs $(\vec{n},\vec{l})$.

  • Rotation invariance via Wigner symbols:

$$B_{\vec{n}\vec{l}\vec{L}}(i) = \sum_{\vec{m}} W_{\vec{l}}^{\vec{m}}(\vec{L};\vec{M})\, \bar\Phi_{\vec{n}\vec{l}\vec{m}}(i)$$

where $\vec{L}$ are intermediate angular momenta satisfying generalized triangle (“polygon”) conditions, and $L_R = 0$ yields full rotational invariance.
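As a minimal numerical sketch of the rotation-invariance step, the lowest nontrivial case (rank 2, $l = 1$) can be checked directly. For rank 2 the coupling to $L_R = 0$ reduces, up to normalization, to $\sum_m (-1)^m A_{n_1 1 m} A_{n_2 1 -m}$. The radial function `radial`, the cutoff `r_cut`, the helper names, and the neighbor coordinates below are illustrative assumptions, not taken from Goff et al.

```python
import math

def radial(n, r, r_cut=5.0):
    """Hypothetical radial function R_n: power of r times a cosine cutoff."""
    if r >= r_cut:
        return 0.0
    return (r / r_cut) ** n * 0.5 * (1.0 + math.cos(math.pi * r / r_cut))

def y1(m, v):
    """Complex spherical harmonic Y_1^m (Condon-Shortley phase) of vector v."""
    x, y, z = v
    r = math.sqrt(x * x + y * y + z * z)
    if m == 0:
        return complex(math.sqrt(3.0 / (4.0 * math.pi)) * z / r, 0.0)
    return -m * math.sqrt(3.0 / (8.0 * math.pi)) * complex(x, m * y) / r

def atomic_base(n, m, neighbors):
    """A_{n,1,m}(i): sum of single-bond functions over neighbor vectors."""
    total = 0j
    for v in neighbors:
        r = math.sqrt(sum(c * c for c in v))
        total += radial(n, r) * y1(m, v)
    return total

def rank2_invariant(n1, n2, neighbors):
    """Unnormalized rank-2, l=1 invariant: sum_m (-1)^m A_{n1,1,m} A_{n2,1,-m}."""
    return sum((-1) ** abs(m) * atomic_base(n1, m, neighbors)
               * atomic_base(n2, -m, neighbors) for m in (-1, 0, 1))

def rotate(v, alpha, beta):
    """Rotate v by alpha about the z axis, then by beta about the x axis."""
    x, y, z = v
    x, y = (x * math.cos(alpha) - y * math.sin(alpha),
            x * math.sin(alpha) + y * math.cos(alpha))
    y, z = (y * math.cos(beta) - z * math.sin(beta),
            y * math.sin(beta) + z * math.cos(beta))
    return (x, y, z)

# Made-up neighbor environment, before and after an arbitrary rotation.
neighbors = [(1.0, 0.2, -0.3), (-0.5, 1.1, 0.7), (0.3, -0.9, 1.4)]
rotated = [rotate(v, 0.7, -1.3) for v in neighbors]
b_before = rank2_invariant(1, 2, neighbors)
b_after = rank2_invariant(1, 2, rotated)
print(abs(b_before - b_after))  # difference should be at round-off level
```

The individual $A_{nlm}$ change under rotation; only the coupled sum stays fixed, which is the property the Wigner-symbol contraction generalizes to higher rank.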

These constructions were previously known to form highly over-complete sets, necessitating numerical SVD to identify a minimal basis.
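The size of the candidate set can be made concrete for rank 4. Assuming a balanced pairwise coupling scheme, $(l_1 l_2) \to L_1$ and $(l_3 l_4) \to L_2$ (an assumption about the tree shape), $L_R = 0$ forces $L_1 = L_2$, and each intermediate must satisfy its own triangle condition. The hypothetical helper below lists the admissible intermediate labels; inversion-parity and permutation constraints are deliberately not applied.

```python
def rank4_intermediates(l1, l2, l3, l4):
    """List (L1, L2) allowed by the triangle conditions with L_R = 0.

    Assumes the balanced coupling ((l1 l2)L1, (l3 l4)L2) -> L_R.
    Coupling L1 and L2 to zero total angular momentum requires L1 == L2.
    """
    left = range(abs(l1 - l2), l1 + l2 + 1)        # allowed L1 values
    right = set(range(abs(l3 - l4), l3 + l4 + 1))  # allowed L2 values
    return [(L, L) for L in left if L in right]

for ls in [(1, 1, 1, 1), (0, 1, 2, 3), (1, 1, 3, 3)]:
    print(ls, rank4_intermediates(*ls))
```

Each label defines one candidate function per ordering of the $(n, l)$ pairs, so the raw candidate list grows quickly even before linear dependencies are considered.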

2. Analytical Linear Relationships via Recursion and Symmetry

The MRIB methodology eliminates the need for numerical SVD by exploiting analytical linear relations among RPI functions, derived from permutation symmetries and generalized Wigner symbol recursions:

  • Permutation symmetries:

Binary-tree automorphism groups $G_N$ (for example, $G_4$ for rank-4 clusters) induce sign-altered permutations of intermediate labels in Wigner-symbol coupled functions.

  • Ladder recursion relations:

Varying one intermediate angular momentum or projection in the Wigner-3j and higher-order coupling coefficients generates analytic raising/lowering relationships among cluster basis elements:

[equation missing in source]

  • Explicit intra-block linearities:

For example, in rank 4 with […],

[equation missing in source]

Generalization to higher rank exploits the same families of recursion to express any within-block RPI function as a linear combination of a selected independent subset.
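The coupling coefficients behind these relations are typically assembled from ordinary Wigner 3j symbols. As a hedged illustration of that building block only (not of the MRIB recursions themselves), the sketch below evaluates a 3j symbol for integer angular momenta with the standard Racah formula and checks the orthogonality sum $(2 j_3 + 1) \sum_{m_1 m_2} \begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & m_3 \end{pmatrix}^2 = 1$. The function names are my own.

```python
from math import factorial as f, sqrt

def wigner_3j(j1, j2, j3, m1, m2, m3):
    """Wigner 3j symbol for integer arguments via the Racah formula."""
    if m1 + m2 + m3 != 0:
        return 0.0
    if not abs(j1 - j2) <= j3 <= j1 + j2:
        return 0.0
    if abs(m1) > j1 or abs(m2) > j2 or abs(m3) > j3:
        return 0.0
    # Triangle coefficient Delta(j1, j2, j3).
    delta = (f(j1 + j2 - j3) * f(j1 - j2 + j3) * f(-j1 + j2 + j3)
             / f(j1 + j2 + j3 + 1))
    pref = sqrt(delta * f(j1 + m1) * f(j1 - m1) * f(j2 + m2)
                * f(j2 - m2) * f(j3 + m3) * f(j3 - m3))
    # Sum over all k that keep every factorial argument non-negative.
    k_min = max(0, j2 - j3 - m1, j1 - j3 + m2)
    k_max = min(j1 + j2 - j3, j1 - m1, j2 + m2)
    total = 0.0
    for k in range(k_min, k_max + 1):
        denom = (f(k) * f(j1 + j2 - j3 - k) * f(j1 - m1 - k) * f(j2 + m2 - k)
                 * f(j3 - j2 + m1 + k) * f(j3 - j1 - m2 + k))
        total += (-1 if k % 2 else 1) / denom
    phase = -1 if (j1 - j2 - m3) % 2 else 1
    return phase * pref * total

def orthogonality_sum(j1, j2, j3, m3):
    """(2*j3 + 1) * sum over m1, m2 of the squared 3j symbol."""
    acc = 0.0
    for m1 in range(-j1, j1 + 1):
        for m2 in range(-j2, j2 + 1):
            acc += wigner_3j(j1, j2, j3, m1, m2, m3) ** 2
    return (2 * j3 + 1) * acc

print(wigner_3j(1, 1, 0, 0, 0, 0), orthogonality_sum(2, 1, 2, 1))
```

For production work a tested library routine is preferable to a hand-rolled formula; the point here is only that the coefficients are closed-form, which is what makes analytic relations among coupled functions possible.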

3. Block-Wise Selection Algorithm for MRIB Construction

The MRIB selects, from each over-complete block, a set of RPI functions provably linearly independent via a block-wise construction algorithm:

Step 1: Block construction and permutation adaptation

  • For each block $(\vec{n},\vec{l})$, compute its frequency partition
  • Form a coupling-compatible partition for maximal automorphisms
  • Generate representative permutations and all distinguishable […] up to automorphisms
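Step 1 can be sketched for rank 4 under two stated assumptions: that the coupling tree is the balanced one pairing leaves (1, 2) and (3, 4), and that the "frequency partition" means the multiplicities of repeated $(n, l)$ pairs in a block. All function names below are hypothetical. The sketch enumerates the tree automorphisms by brute force and keeps one arrangement of the block per orbit.

```python
from collections import Counter
from itertools import permutations

PAIRS = ((0, 1), (2, 3))  # assumed balanced coupling tree ((1 2)(3 4))

def tree_automorphisms():
    """Leaf permutations that map the set of coupled pairs onto itself."""
    target = {frozenset(p) for p in PAIRS}
    autos = []
    for perm in permutations(range(4)):
        image = {frozenset(perm[i] for i in p) for p in PAIRS}
        if image == target:
            autos.append(perm)
    return autos

def frequency_partition(block):
    """Multiplicities of repeated (n, l) pairs, largest first."""
    return tuple(sorted(Counter(block).values(), reverse=True))

def arrangements_up_to_automorphism(block):
    """One representative ordering of the block per automorphism orbit."""
    autos = tree_automorphisms()
    seen, reps = set(), []
    for arr in sorted(set(permutations(block))):
        if arr in seen:
            continue
        reps.append(arr)
        for g in autos:  # mark the whole orbit of arr as visited
            seen.add(tuple(arr[g[i]] for i in range(4)))
    return reps

block = ((1, 1), (1, 1), (2, 1), (2, 1))  # made-up block with partition (2, 2)
print(len(tree_automorphisms()), frequency_partition(block))
for rep in arrangements_up_to_automorphism(block):
    print(rep)
```

Brute force over $S_4$ is only meant to make the orbit structure visible; higher ranks would need the group generated from the tree rather than enumerated.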

Step 2: Block-internal sampling

  • Derive the “ladder order” of the over-complete block
  • Apply a precomputed sampling pattern (e.g., a fixed stride or an index sublist), guaranteed by the analytic relations to select a maximal independent subset

Pseudocode (verbatim):

[pseudocode missing in source]

In practice, sampling patterns are precomputed per block type rather than tested at runtime.
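The lookup-style sampling of Step 2 might be organized as in the sketch below. This is an illustration of the control flow only: the pattern table, its keys, and its values are placeholders invented for the example, not the analytically derived patterns of Goff et al., and `sample_block` is a hypothetical name.

```python
def sample_block(ladder_ordered, pattern):
    """Pick a subset of a ladder-ordered block using a precomputed pattern.

    pattern is ("stride", k) to keep every k-th entry, or
    ("indices", [i0, i1, ...]) to keep an explicit index sublist.
    """
    kind, arg = pattern
    if kind == "stride":
        return ladder_ordered[::arg]
    if kind == "indices":
        return [ladder_ordered[i] for i in arg]
    raise ValueError(f"unknown pattern kind: {kind!r}")

# Placeholder table keyed by (rank, frequency partition); values are dummies.
PATTERNS = {
    (4, (1, 1, 1, 1)): ("stride", 2),
    (4, (2, 2)): ("indices", [0, 3]),
}

candidates = ["B0", "B1", "B2", "B3", "B4", "B5"]  # stand-ins for RPI functions
print(sample_block(candidates, PATTERNS[(4, (1, 1, 1, 1))]))
print(sample_block(candidates, PATTERNS[(4, (2, 2))]))
```

Because the selection is a table lookup, no overlap matrix is formed and no rank test is run when the basis is built.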

4. Independence and Completeness: Analytical Results and Conjecture

Block-wise independence:

By construction, the selected subset in each block cannot be connected via a single Wigner recursion or permitted permutation; in explicit cases (e.g., rank-4 […] or […]), dependence arises only between known pairings, and the selection removes exactly one from each dependent pair.

Completeness conjecture:

Total PA-selected function counts match the group-theoretic $\mathrm{SO}(3) \times S_N$ representation enumeration:

[equation missing in source]

up to at least $N = 5, 6$ and degree […], always agreeing with numerically SVD-pruned bases. Orthogonality of generalized Wigner symbols implies no “across-block” overlap, but a fully rigorous Gram–Schmidt orthonormality proof remains open.
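One standard way to obtain such a group-theoretic count, offered here as a sketch and not necessarily the enumeration formula used by Goff et al., is weight counting. A block in which the pair $(n_i, l_i)$ occurs $k_i$ times spans $\bigotimes_i \mathrm{Sym}^{k_i}(V_{l_i})$, and the number of rotation invariants equals the number of weight states with total $M = 0$ minus the number with $M = 1$. Inversion parity is not enforced, and the function names are my own.

```python
from collections import Counter
from itertools import combinations_with_replacement

def weight_counts(l, k):
    """Map total M -> number of multisets of k projections drawn from -l..l."""
    counts = Counter()
    for ms in combinations_with_replacement(range(-l, l + 1), k):
        counts[sum(ms)] += 1
    return counts

def convolve(a, b):
    """Combine two independent weight distributions."""
    out = Counter()
    for ma, ca in a.items():
        for mb, cb in b.items():
            out[ma + mb] += ca * cb
    return out

def count_invariants(block):
    """Number of independent SO(3)- and permutation-invariant functions.

    block is a sequence of (n, l) pairs; repeated pairs are symmetrized.
    """
    total = Counter({0: 1})
    for (_n, l), k in Counter(block).items():
        total = convolve(total, weight_counts(l, k))
    return total[0] - total[1]  # multiplicity of L = 0

for blk in [[(1, 1), (2, 1), (3, 1), (4, 1)],
            [(1, 1), (1, 1), (2, 1), (2, 1)],
            [(1, 1)] * 4]:
    print(blk, count_invariants(blk))
```

Comparing this count with the number of candidate coupled functions in a block shows directly how many must be discarded, whether by SVD or by an analytic selection.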

5. Numerical Benchmarks and Descriptor Counts

Numerical enumeration supports the MRIB’s independence and its conjectured completeness. The block-wise analytic selection yields substantial descriptor count reductions:

Body order, […] | All permutations | Lexicographic RPI | PA-RPI (MRIB) | Fraction retained
Rank 4, […] | […] | […] | […] | […]
Rank 5, […] | […] | […] | […] | […]

Per-block lexicographic overshoots of up to 40% are rectified by PA sampling, with the MRIB matching the SVD-pruned cardinality in every tested instance (Tables II–VI of Goff et al., 2022).

A practical case is the construction of a linear ACE interatomic potential for tantalum using FitSNAP and Bayesian compressive sensing. In this application, even under strong […] (density) regularization, identifiable high-degree cluster descriptors from the MRIB basis persist and are critical for achieving optimal energy and force accuracy. This demonstrates that physically meaningful high-order clusters are retained under the MRIB construction.

6. MRIB: Advantages, Trade-Offs, and Implications for Machine-Learning Potentials

Advantages:

  • Analytical block-wise independence eliminates small-singular-value pathologies typical of SVD
  • No need for large-order SVD computations on ill-conditioned overlap matrices, ensuring stability at high degree or body order
  • Basis is minimal: every retained function is individually rotation- and permutation-invariant, so any further removal would sacrifice the (conjectured) completeness of the set

Trade-offs compared to SVD-based bases:

  • Fully reproducible, transparent analytical selection
  • Avoids the […] cost of SVD
  • Upfront derivation of the permutation and recursion relations is needed
  • For large […], Wigner symbols and automorphisms must be generated; these scale as […] per block

Implications for sparse regression:

MRIB’s minimal basis alleviates descriptor multicollinearity, facilitating sparser and more stable regression solutions. Bayesian compressive sensing can efficiently identify the most informative MRIB descriptors, and critically, high-degree components often survive regularization, contributing meaningfully to model accuracy.

In summary, the MRIB, equivalent to the PA-RPI analytic basis, provides a theoretically grounded, fully analytic, and block-wise independent foundation for ACE descriptor sets. Its function counts agree with SVD-pruned approaches in every tested case, while its stability and reproducibility are improved, especially in high-rank and high-degree regimes, which makes it a robust platform for constructing machine-learned interatomic potentials (Goff et al., 2022).
