
Symmetry-Adapted KANs

Updated 18 June 2026
  • Symmetry-Adapted KANs are machine learning frameworks that rigorously encode group symmetries (e.g., rotations, Lorentz transformations, permutations) to ensure equivariance throughout neural architectures.
  • They leverage gated spline nonlinearities, SVD-projected equivariant linear layers, and lift operations to systematically enforce symmetry constraints at every stage of the model.
  • Applications span particle physics, quantum chemistry, and materials science, achieving superior prediction accuracy and parameter efficiency with symmetry-aware design.

Symmetry-Adapted Kolmogorov–Arnold Networks (KANs) are a class of machine learning frameworks designed to respect and efficiently encode arbitrary group symmetries—such as rotations, Lorentz transformations, and permutations—within neural architectures. In particular, recent developments have focused on two strands: (i) the extension of spline-based Kolmogorov–Arnold Networks to incorporate equivariance under arbitrary matrix groups, yielding so-called Equivariant KANs (EKANs) (Hu et al., 2024), and (ii) the use of symmetry-adapted density-correlation features for atomistic and quantum chemical property prediction, where the model outputs themselves are matrices or tensors with defined transformation rules (Nigam et al., 2021). These approaches systematically exploit symmetry priors to maximize data efficiency, generalization, and physical faithfulness in scientific machine learning.

1. Mathematical Construction of Equivariant Kolmogorov–Arnold Networks

EKANs generalize conventional KANs by ensuring each stage of the network is equivariant under a user-specified matrix group $G$. The central architectural motif comprises three components:

  • Gated spline nonlinearities: Pre-activation vectors $v_{gi}$ are partitioned as $(\oplus_{a=1}^{c_i} s_{i,a}) \oplus (\oplus_{a=1}^{A_i} v_{i,a}) \oplus (\oplus_{a=1}^{A_i} s'_{i,a})$, where $s_{i,a}$ are scalars, $v_{i,a} \in T(p_{i,a},q_{i,a})$ are tensors transforming under irreducible representations $\rho_{i,a}(g)$, and $s'_{i,a}$ are gate scalars, one for each non-scalar channel. The post-activation is

$$v_{m,b} = \bigl[\oplus_{a=1}^{c_i} s_{i,a} B_b(s_{i,a})\bigr] \oplus \bigl[\oplus_{a=1}^{A_i} v_{i,a} B_b(s'_{i,a})\bigr]$$

for each spline basis function $B_b$; a final channel applies the SiLU function. This construction ensures that the nonlinearity commutes with the group action (Theorem 3.1), i.e.,

$$\forall g\in G,~ \rho_m(g) f(v_{gi}) = f(\rho_{gi}(g) v_{gi}).$$

  • Equivariant linear layers: The weights $W$ must satisfy, for all $g \in G$:

$$\rho_{\text{out}}(g)\, W = W\, \rho_{\text{in}}(g),$$

where $\rho_{\text{in}}$ and $\rho_{\text{out}}$ denote the representations acting on the layer's input and output spaces. This reduces to a linear nullspace constraint on $\mathrm{vec}(W)$ via a constraint matrix $C$ formed from infinitesimal and discrete generators of $G$. The nullspace is computed via SVD, and random weights are projected onto this subspace, exactly enforcing equivariance (Hu et al., 2024).

  • Lift layers: The initial feature space is mapped to the gated input space of the first EKAN layer by an equivariant linear "lift". Through this lift map, each tensor block in the lifted space receives an associated gate scalar, and the total map maintains group equivariance because the lift itself satisfies the equivariance constraint.
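The nullspace projection described for the equivariant linear layers can be sketched numerically. The example below is a minimal illustration, not the EKAN reference implementation: it assumes the groups SO(2) and O(2) acting on 2-vectors at both input and output, and the helper names `constraint_matrix`, `nullspace`, and `project` are hypothetical.

```python
import numpy as np

def constraint_matrix(pairs, n_in, n_out):
    """Stack one block per (M_in, M_out) pair encoding M_out @ W - W @ M_in = 0.

    With row-major flattening, vec(M_out @ W) = kron(M_out, I) @ vec(W)
    and vec(W @ M_in) = kron(I, M_in.T) @ vec(W).
    """
    blocks = [np.kron(m_out, np.eye(n_in)) - np.kron(np.eye(n_out), m_in.T)
              for m_in, m_out in pairs]
    return np.vstack(blocks)

def nullspace(C, tol=1e-10):
    """Orthonormal basis (as rows) of the nullspace of C, computed by SVD."""
    _, s, vt = np.linalg.svd(C)
    rank = int(np.sum(s > tol))
    return vt[rank:]

def project(W, basis):
    """Orthogonally project W onto the span of the equivariant basis."""
    w = W.reshape(-1)
    return (basis.T @ (basis @ w)).reshape(W.shape)

# Assumed example groups: SO(2) (one infinitesimal generator) and O(2)
# (the same generator plus one discrete reflection).
J = np.array([[0.0, -1.0], [1.0, 0.0]])   # infinitesimal generator of SO(2)
F = np.array([[1.0, 0.0], [0.0, -1.0]])   # reflection, discrete generator of O(2)

basis_so2 = nullspace(constraint_matrix([(J, J)], 2, 2))
basis_o2 = nullspace(constraint_matrix([(J, J), (F, F)], 2, 2))

# Project random weights onto the SO(2)-equivariant subspace.
rng = np.random.default_rng(0)
W_eq = project(rng.normal(size=(2, 2)), basis_so2)

# The projected weights should commute with any finite rotation.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
commutator_norm = np.linalg.norm(R @ W_eq - W_eq @ R)

# Two basis elements for SO(2) (identity and J), one for O(2) (identity only).
print(basis_so2.shape[0], basis_o2.shape[0], commutator_norm)
```

Adding the reflection shrinks the admissible weight space from two dimensions to one, which shows how a larger symmetry group reduces the number of free parameters.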

The composition of lift, gated nonlinear, and equivariant linear layers guarantees that the entire architecture satisfies

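$$\forall g \in G,~ \rho_{\mathcal{Y}}(g)\, \mathrm{EKAN}(x) = \mathrm{EKAN}\bigl(\rho_{\mathcal{X}}(g)\, x\bigr),$$

where $\rho_{\mathcal{X}}$ and $\rho_{\mathcal{Y}}$ are the representations of $G$ on the network's input and output spaces.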
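The commutation property of the gated nonlinearity can be checked numerically on a toy case. The sketch below rests on several assumptions: one scalar, one 2-vector, and one gate scalar under SO(2); Gaussian bumps stand in for the B-spline basis functions $B_b$; and the final SiLU channel is omitted. The function names are hypothetical.

```python
import numpy as np

CENTERS = np.linspace(-2.0, 2.0, 5)

def basis(x):
    """Gaussian bumps standing in for spline basis functions B_b (assumption)."""
    return np.exp(-(x - CENTERS) ** 2)          # shape (5,)

def gated_nonlinearity(s, v, gate):
    """Return scalar channels s * B_b(s) and vector channels v * B_b(gate)."""
    scal = s * basis(s)                         # (5,), invariant
    vecs = np.outer(basis(gate), v)             # (5, 2), row b is B_b(gate) * v
    return scal, vecs

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

s, gate = 0.3, -0.8                             # scalars are unchanged by rotation
v = np.array([1.5, -0.5])                       # vector rotates as v -> R v

scal, vecs = gated_nonlinearity(s, v, gate)
scal_rot, vecs_rot = gated_nonlinearity(s, R @ v, gate)

# Rotating the output (each row of vecs) must match the output of the rotated input.
gated_err = max(np.abs(scal - scal_rot).max(),
                np.abs(vecs @ R.T - vecs_rot).max())

# For contrast, a pointwise nonlinearity applied directly to v is not equivariant.
naive_err = np.abs(np.tanh(R @ v) - R @ np.tanh(v)).max()

print(gated_err, naive_err)
```

The gate only rescales each vector by an invariant number, so the rotation passes through unchanged, whereas the pointwise `tanh` mixes components in a basis-dependent way.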

2. Symmetry-Adapted N-center Density-Correlation Features for Atomistic Properties

In molecular and materials modeling, predicting quantities (e.g., Hamiltonian matrix elements) that transform nontrivially under rotations, translations, and permutations requires symmetry-adapted features. The N-center density-correlation features of (Nigam et al., 2021) provide an explicit recipe:

  • Raw feature construction: For a structure with atomic positions $\{\mathbf{r}_i\}$, the $N$-center feature is built from a smooth atomic density in which each atom contributes a Gaussian of fixed width centered on its position; the explicit expression is given in (Nigam et al., 2021).

  • Translation invariance: Achieved by integrating over global translation, leading to features expressed relative to a central atom.
  • O(3) symmetrization: Features are averaged over $O(3)$ and expanded in real spherical harmonics and radial basis functions, yielding tensors with explicit angular-momentum labels.

  • Permutation symmetrization: Index symmetries are imposed explicitly, symmetrizing or antisymmetrizing as relevant for the physical property.
  • Learning matrix-valued quantum properties: The atomic-orbital Hamiltonian $H$ is decomposed into symmetry-adapted blocks by coupling spherical harmonic indices into irreducible multiplets of the rotation group. Linear models (ridge regression) or symmetry-adapted Gaussian process regression (SA-GPR) are trained on the resulting feature blocks, which are block-diagonalized by symmetry.
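The idea of splitting a matrix-valued target into symmetry-adapted blocks can be illustrated on the simplest case. The sketch below assumes a 3×3 Cartesian block (as for a pair of p-type orbitals) that transforms as $M \mapsto R M R^{T}$; it separates the block into a scalar, an antisymmetric, and a symmetric traceless part and checks that each part transforms within itself. General angular momenta would require Clebsch-Gordan coupling, which is not shown, and `irreducible_parts` is a hypothetical helper name.

```python
import numpy as np

def irreducible_parts(M):
    """Split a 3x3 block into its 1 + 3 + 5 dimensional rotational components."""
    iso = np.trace(M) / 3.0 * np.eye(3)         # scalar part (1 component)
    anti = 0.5 * (M - M.T)                      # antisymmetric part (3 components)
    sym = 0.5 * (M + M.T) - iso                 # symmetric traceless part (5 components)
    return iso, anti, sym

def rot_z(a):
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0, 0.0, 1.0]])

def rot_x(a):
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, np.cos(a), -np.sin(a)],
                     [0.0, np.sin(a),  np.cos(a)]])

R = rot_z(0.7) @ rot_x(-1.1)
rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3))

parts = irreducible_parts(M)
parts_rot = irreducible_parts(R @ M @ R.T)

# Decomposing after rotating must equal rotating each part separately.
errors = [np.abs(R @ p @ R.T - q).max() for p, q in zip(parts, parts_rot)]
print(errors)
```

Because the parts never mix under rotation, a separate regression model can be fitted to each one, which is the sense in which the learning problem becomes block-diagonal.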

3. End-to-End Architecture and Workflow

Symmetry-Adapted KANs present an end-to-end flow:

  1. Data preprocessing: Input raw features are mapped into a group-representation space reflecting the symmetry content of the problem.
  2. Lift operation: Data are mapped to the gated input space of the first layer, introducing one gate scalar per tensorial component so that the nonlinear gating structure is well defined.
  3. Stack of equivariant layers: Each layer alternates a gated spline equivariant nonlinearity with an SVD-projected equivariant linear map, ensuring exact group equivariance at every stage.
  4. Final projection: After multiple layers, the final output contains both physical feature channels and gate scalars, from which the latter are dropped to yield the final, symmetry-adapted prediction.

For atomistic-property prediction, feature construction, contraction (e.g., via PCA or the NICE contraction), and block-diagonalization by irreducible labels are key elements (Nigam et al., 2021).
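A PCA-style contraction that preserves equivariance can be sketched as follows. This is an illustrative assumption rather than the NICE procedure itself: features for one symmetry channel are stored as an array of shape (samples, features, components), the projection acts only on the feature axis, and no mean is subtracted. The names `fit_contraction` and `contract` are hypothetical.

```python
import numpy as np

def fit_contraction(X, n_keep):
    """Find the leading feature directions of X with shape (samples, features, m)."""
    n_features = X.shape[1]
    # Treat every (sample, m) pair as one observation of the feature vector.
    flat = X.transpose(0, 2, 1).reshape(-1, n_features)
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return vt[:n_keep].T                        # (features, n_keep)

def contract(X, P):
    """Project the feature axis only, leaving the m axis untouched."""
    return np.einsum("sfm,fk->skm", X, P)

# Synthetic data with 8 features that secretly span only 3 directions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 3, 5))
mixing = rng.normal(size=(8, 3))
X = np.einsum("skm,fk->sfm", latent, mixing)    # (20, 8, 5)

P = fit_contraction(X, n_keep=3)
X_small = contract(X, P)                        # (20, 3, 5)

# Keeping 3 components loses nothing for rank-3 data.
X_rec = np.einsum("skm,fk->sfm", X_small, P)
recon_err = np.abs(X_rec - X).max()

# Any linear transformation D acting on the m axis commutes with the contraction.
D = rng.normal(size=(5, 5))
transform = lambda T: np.einsum("sfm,nm->sfn", T, D)
equiv_err = np.abs(contract(transform(X), P) - transform(X_small)).max()

print(X_small.shape, recon_err, equiv_err)
```

Restricting the projection to the feature axis is the design choice that matters here: mixing the component axis would destroy the transformation rule that the downstream model relies on.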

4. Empirical Performance and Benchmarks

EKANs have demonstrated substantial gains in sample complexity and parameter efficiency across physics-inspired tasks (Hu et al., 2024). Typical results include:

| Task | Best EKAN test MSE or accuracy | Baseline models compared | Relative parameter efficiency |
|---|---|---|---|
| Particle scattering, O(1,3) | … | MLP, KAN, EMLP-O(1,3) | Outperforms EMLP and MLP by 1-3 orders of magnitude |
| Three-body, O(2)-equivariance | … | MLP, KAN, EMLP-SO(2) | Beats others at … params |
| Top-quark tagging, O(1,3) | 76.93% | MLP, KAN, EMLP-O(1,3) | 26% of EMLP param count |

EKAN achieves the lowest test MSEs in Lorentz-invariant scattering and O(2)-equivariant three-body problems with drastically fewer parameters versus existing architectures. In top-quark tagging, EKAN attains comparable or superior accuracy while using only a fraction of the parameters. Notably, vanilla KANs (without symmetry adaptation) fail to outperform MLPs in such equivariant tasks, confirming the necessity of symmetry enforcement in these domains (Hu et al., 2024).

For the N-center density-correlation features, benchmarks include water molecule and ethanol trajectory datasets (full-matrix and eigenvalue RMSEs reported in meV for small training regimes) and the QM7b-CHNO chemical space (eigenvalue MAE of up to 0.3 eV after kernel learning and PCA reduction) (Nigam et al., 2021).

5. Practical Implementation and Guidelines

Key design and implementation principles for Symmetry-Adapted KANs:

  • Group specification: The user defines the relevant matrix group $G$ (e.g., O(3), SO(2), O(1,3)), the associated irreducible representations for each feature channel, and accompanying generators for constraint enforcement.
  • Spline and basis function selection: Spline activations are used as nonlinearities, with the number of basis functions set to control expressivity.
  • Constraint solution: The SVD of the constraint matrix $C$ yields a basis for all allowed equivariant linear maps; all weights are projected onto this space to preserve equivariance exactly.
  • Computational optimizations: For N-center density-correlation features, the "density trick" exploits factorization to avoid explicit sums over tuples of neighbors, reducing the computational cost per environment.
  • Feature reduction: After generating full sets of high-dimensional equivariant features, iterative PCA or NICE contraction is used to retain only the leading components per symmetry channel, trading off memory footprint against information retention (Nigam et al., 2021).
  • Validation: Equivariance is validated by applying group actions (rotations, permutations) to both input and output, confirming invariance or equivariant transformations up to numerical precision.
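The validation step can be written as a small generic check. The sketch below assumes the group acts on inputs and outputs through matrices `rho_in` and `rho_out`; the helper `equivariance_error` and the two test functions are hypothetical examples, not part of any published code.

```python
import numpy as np

def equivariance_error(f, rho_in, rho_out, x):
    """Largest absolute deviation between f(rho_in x) and rho_out f(x)."""
    return float(np.max(np.abs(f(rho_in @ x) - rho_out @ f(x))))

def radial_scaling(x):
    """Equivariant under O(3): scales x by its rotation-invariant squared norm."""
    return x * np.dot(x, x)

def elementwise_cube(x):
    """Not equivariant: cubing each coordinate depends on the chosen axes."""
    return x ** 3

def squared_norm(x):
    """Invariant: the output is a single scalar that no rotation changes."""
    return np.array([np.dot(x, x)])

a = 0.7
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a),  np.cos(a), 0.0],
              [0.0, 0.0, 1.0]])
x = np.array([1.0, 2.0, 3.0])

err_equivariant = equivariance_error(radial_scaling, R, R, x)
err_broken = equivariance_error(elementwise_cube, R, R, x)
err_invariant = equivariance_error(squared_norm, R, np.eye(1), x)

print(err_equivariant, err_broken, err_invariant)
```

In practice the check would be repeated over many random group elements and inputs, with a tolerance chosen according to the floating-point precision in use.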

6. Connections and Applications

Symmetry-Adapted KANs integrate the expressive capabilities of spline-based KANs with the systematic equivariant linear weight construction of EMLP frameworks. Applications include:

  • Physics regression tasks: Modeling Lorentz-invariant particle scattering, O(2)-equivariant three-body problems, and hadronic jet classification.
  • Quantum chemistry: Prediction of matrix-valued Hamiltonians, eigenvalues, and spectra in atom-centered orbital bases, which requires well-defined transformation behavior under O(3) and permutation symmetries (Nigam et al., 2021).
  • Materials science and condensed matter: Construction of symmetry-adapted descriptors for tensorial observables, such as Hamiltonian blocks, two-center integrals, and J-couplings.

The exact preservation of physical symmetries results in physically consistent predictions, improved sample efficiency, and often dramatic reductions in required parameter count to reach baseline or superior accuracy.

7. Significance and Outlook

The unification of universal function approximation (via KANs) with exact group equivariance (via gating and SVD-projected linears) marks a significant advance for scientific machine learning. In regimes where high-fidelity, symmetry-respecting regression or classification is critical, such as high-energy physics, quantum chemistry, and molecular modeling, these architectures afford robust, interpretable solutions. Empirical evidence indicates that explicit symmetry adaptation is essential for leveraging the full modeling power of spline-based networks in structured scientific tasks (Hu et al., 2024, Nigam et al., 2021). A plausible implication is that further generalization to larger groups, or to higher-order tensor-valued targets (as in higher-body interactions), will benefit from the modular, representation-theoretic formalism of the Symmetry-Adapted KAN framework.
