
Graph KAN (GKAN): Group-Invariant Neural Architecture

Updated 7 December 2025
  • Graph KAN (GKAN) is a neural network architecture that extends Kolmogorov–Arnold Networks by incorporating geometric invariances, such as Euclidean and permutation symmetries, into function modeling.
  • It employs univariate basis transformations on invariant scalar features to ensure exact or efficient representations while reducing the network complexity compared to traditional approaches.
  • Empirical results show that GKAN achieves state-of-the-art performance in modeling physical and molecular systems, offering improved parameter efficiency and training stability.

A Graph KAN (GKAN) is the geometric and group-invariant extension of Kolmogorov–Arnold Networks (KANs), originally developed to implement the Kolmogorov–Arnold superposition theorem in neural architectures. GKANs generalize the KAN paradigm to data with nontrivial geometric symmetries (such as Euclidean, orthogonal, or permutation invariances), enabling exact or highly efficient modeling of functions on spaces equipped with group actions relevant to natural sciences and engineering. As such, GKAN constitutes a unifying architecture that bridges the gap between sharp theoretical representation results and practical, symmetry-aware machine learning systems (Alesiani et al., 23 Feb 2025).

1. Mathematical Foundations: Kolmogorov–Arnold Superposition Theorem and KANs

The Kolmogorov–Arnold Theorem (KAT), or Kolmogorov Superposition Theorem (KST), states that any continuous multivariate function $f:[0,1]^m \to \mathbb{R}$ can be written exactly as a finite sum of compositions of continuous univariate functions and addition:
$$f(x_1,\ldots,x_m) = \sum_{q=1}^{2m+1} \psi_q\left( \sum_{p=1}^{m} \phi_{q,p}(x_p) \right),$$
where $\phi_{q,p}$ (the "inner" functions) and $\psi_q$ (the "outer" functions) are continuous and univariate (Alesiani et al., 23 Feb 2025). This representation is theoretically exact and, critically, avoids any exponential dependence on the input dimension: the number of terms scales only linearly with $m$.

Kolmogorov–Arnold Networks (KANs) instantiate this decomposition as a neural network architecture, with two stages at each layer: (i) the application of univariate basis transformations to each coordinate, and (ii) an outer aggregation via another set of univariate nonlinearities. In standard practice, univariate maps are implemented via B-spline expansions, giving KANs explicit constructive and universal approximation properties (Basina et al., 15 Nov 2024).
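To make the two-stage structure concrete, the following minimal sketch implements the superposition as a small PyTorch module. A Gaussian radial-basis expansion stands in for the B-spline bases used in practice, and all class and parameter names are illustrative rather than taken from any released KAN/GKAN code.

```python
# Minimal sketch of a Kolmogorov-Arnold superposition layer.
# Assumption: a Gaussian RBF basis replaces the B-spline bases used in practice;
# names are illustrative, not from the papers' implementations.
import torch
import torch.nn as nn


class Univariate(nn.Module):
    """A learnable scalar->scalar map: a linear combination of fixed RBF bumps."""

    def __init__(self, num_basis: int = 16, x_min: float = -2.0, x_max: float = 2.0):
        super().__init__()
        self.register_buffer("centers", torch.linspace(x_min, x_max, num_basis))
        self.width = (x_max - x_min) / num_basis
        self.coeffs = nn.Parameter(torch.zeros(num_basis))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Evaluate each RBF bump at x and combine linearly.
        basis = torch.exp(-((x.unsqueeze(-1) - self.centers) / self.width) ** 2)
        return basis @ self.coeffs


class KASuperposition(nn.Module):
    """f(x_1..x_m) = sum_q psi_q( sum_p phi_{q,p}(x_p) ) with 2m+1 outer terms."""

    def __init__(self, m: int):
        super().__init__()
        self.m, self.q = m, 2 * m + 1
        self.phi = nn.ModuleList(
            nn.ModuleList(Univariate() for _ in range(m)) for _ in range(self.q)
        )
        self.psi = nn.ModuleList(Univariate() for _ in range(self.q))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, m) -> (batch,)
        out = 0.0
        for q in range(self.q):
            inner = sum(self.phi[q][p](x[:, p]) for p in range(self.m))
            out = out + self.psi[q](inner)
        return out


model = KASuperposition(m=3)
y = model(torch.randn(8, 3))  # shape (8,)
```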

2. Motivation for Geometric Extensions: Invariance and Equivariance

In many scientific applications—molecular dynamics, particle physics, geometric deep learning—the target functions to be modeled are not merely arbitrary functions on $\mathbb{R}^{m \times n}$, but obey inherent geometric invariance or equivariance. That is, these functions must remain unchanged (invariant) or transform covariantly (equivariant) under group actions such as

  • $O(n)$: rotations and reflections,
  • $S_m$: permutations of inputs (e.g., atoms, nodes),
  • General linear group $GL(n)$ or Lorentz group $O(1,n)$.

Classical KANs, defined on raw coordinate inputs, cannot natively enforce such symmetry constraints and may require extensive data and parameterization to approximate geometric invariants. This limits their applicability in physical systems where $E(3)$-invariance (invariance under the 3D Euclidean group) or permutation symmetry is essential for generalization and stability (Alesiani et al., 23 Feb 2025).

3. Geometric GKAN Construction: Exact Group-Invariant Superposition

GKANs generalize the superposition framework to functions with group invariance. For $O(n)$-invariant functions $f: (\mathbb{R}^{n})^m \to \mathbb{R}$, the geometric KST asserts the existence of an exact expansion:
$$f(\mathbf{x}_1, \ldots, \mathbf{x}_m) = \sum_{q=1}^{2m^2+1} \psi_q\left( \sum_{i=1}^m \sum_{j=1}^m \phi_{q,i,j}\left( \langle \mathbf{x}_i, \mathbf{x}_j \rangle \right) \right),$$
where the $\langle \mathbf{x}_i, \mathbf{x}_j \rangle$ are $O(n)$-invariant inner products, and all $\phi_{q,i,j}, \psi_q$ are univariate maps (Alesiani et al., 23 Feb 2025). This covers the full algebra of scalar invariants generated by the inputs. For $O(n)$-equivariant maps $F: (\mathbb{R}^{n})^m \to \mathbb{R}^n$, the expansion takes the form:
$$F(\mathbf{x}_1, \ldots, \mathbf{x}_m) = \sum_{k=1}^{m} \left( \sum_q \psi^k_q(\text{invariants}) \right) \mathbf{x}_k,$$
where each coefficient is an invariant scalar function of the inner products. Thus, the GKAN framework systematically reduces the high-dimensional, symmetry-constrained function approximation problem to one over scalar features and univariate nonlinearities.
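The sketch below illustrates the equivariant form numerically, assuming an ordinary MLP over the Gram-matrix invariants in place of the exact univariate superposition; all names are illustrative. The final check verifies that rotating the inputs rotates the output, since the Gram matrix is unchanged under $O(n)$.

```python
# Sketch of the O(n)-equivariant form F(x_1..x_m) = sum_k c_k(invariants) x_k,
# where the invariants are the pairwise inner products <x_i, x_j>.
# Assumption: a plain MLP maps the Gram matrix to the coefficients; in a GKAN
# the same invariants would feed univariate superposition maps instead.
import torch
import torch.nn as nn

m, n = 5, 3
coeff_net = nn.Sequential(nn.Linear(m * m, 64), nn.Tanh(), nn.Linear(64, m))


def equivariant_F(X: torch.Tensor) -> torch.Tensor:
    # X: (m, n) input vectors.
    gram = X @ X.T                    # (m, m) O(n)-invariant scalars <x_i, x_j>
    c = coeff_net(gram.reshape(-1))   # (m,) invariant coefficients c_k
    return c @ X                      # sum_k c_k x_k transforms like the inputs


X = torch.randn(m, n)
Q, _ = torch.linalg.qr(torch.randn(n, n))       # random orthogonal matrix

out, out_rot = equivariant_F(X), equivariant_F(X @ Q.T)
print(torch.allclose(out_rot, out @ Q.T, atol=1e-5))  # True: F(QX) = Q F(X)
```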

Equivariance and invariance under more general groups (e.g., $GL(n)$, the Lorentz group) are achieved by forming suitable scalar invariants (contracted tensor products, norms, cross-products) that are respected by the group action, and then applying the same univariate superposition machinery to those invariants.
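As a hedged illustration of this substitution, the snippet below swaps the Euclidean inner product for the Minkowski product $x^{T}\eta\, y$ with $\eta = \mathrm{diag}(+1,-1,\ldots,-1)$, the natural scalar invariant for the Lorentz group; the boost check at the end is only a sanity test, not part of the published construction.

```python
# Lorentz-invariant scalars: Minkowski inner products replace Euclidean ones.
# Function name and shapes are illustrative.
import torch


def minkowski_gram(X: torch.Tensor) -> torch.Tensor:
    # X: (m, 1 + n) four-vector-like inputs; returns (m, m) Lorentz invariants.
    eta = torch.ones(X.shape[1])
    eta[1:] = -1.0
    return X @ torch.diag(eta) @ X.T


# Sanity check: a boost along the first spatial axis (rapidity 0.3) preserves them.
phi = torch.tensor(0.3)
L = torch.eye(4)
L[0, 0] = L[1, 1] = torch.cosh(phi)
L[0, 1] = L[1, 0] = torch.sinh(phi)
X = torch.randn(6, 4)
print(torch.allclose(minkowski_gram(X @ L.T), minkowski_gram(X), atol=1e-5))  # True
```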

4. Layer Structure of GKANs and Training Protocols

A GKAN layer is built as follows: for input vectors $(\mathbf{z}_1,\ldots,\mathbf{z}_m)\in (\mathbb{R}^n)^m$, the layer computes all required geometric invariants (e.g., $\langle \mathbf{z}_i, \mathbf{z}_j \rangle$) and processes them through an inner collection of univariate functions $\{\phi_{j,k}\}$ to generate intermediate features. These are aggregated (summed) and then mapped by outer univariate functions $\{\psi_{i,k}\}$ to form the final outputs. Optionally, a learnable linear (or residual) path using standard neural nonlinearities (e.g., ReLU) can be added to enhance representational flexibility:
$$\mathbf{z}^{(\ell+1)} = \Psi^{(\ell)}\left( \Phi^{(\ell)\,T}(\mathbf{z}^{(\ell)}) \right) + W^{(\ell)}_\psi\, \sigma\left(W^{(\ell)\,T}_\phi\, \mathbf{z}^{(\ell)}\right),$$
where all univariate maps and residual weights are trained by gradient descent on an empirical loss (e.g., MSE, Huber, or joint energy/force losses) (Alesiani et al., 23 Feb 2025).
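A minimal, illustrative sketch of one such layer acting on a vector of already-computed invariant scalar features is shown below; it assumes cheap learnable cubic polynomials for the univariate maps (the paper uses richer bases such as B-splines), and all module and hyperparameter names are hypothetical. Training such a layer would then proceed by ordinary gradient descent on, e.g., an MSE or Huber loss over the predicted observables.

```python
# Sketch of one GKAN layer with an optional residual MLP path, following
#   z^{(l+1)} = Psi( Phi^T z^{(l)} ) + W_psi * sigma( W_phi^T z^{(l)} ).
# Assumption: learnable cubics stand in for the univariate maps; names are
# illustrative, not from the paper's implementation.
import torch
import torch.nn as nn


class UnivariateBank(nn.Module):
    """Independent learnable cubic a + b*x + c*x^2 + d*x^3 per (input, output) edge."""

    def __init__(self, rows: int, cols: int):
        super().__init__()
        self.coeffs = nn.Parameter(0.01 * torch.randn(rows, cols, 4))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, rows) -> per-edge outputs (batch, rows, cols)
        powers = torch.stack([x**0, x, x**2, x**3], dim=-1)       # (batch, rows, 4)
        return torch.einsum("brk,rck->brc", powers, self.coeffs)  # (batch, rows, cols)


class GKANLayer(nn.Module):
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.phi = UnivariateBank(d_in, d_out)   # inner maps, summed over inputs
        self.psi = UnivariateBank(d_out, 1)      # outer maps, one per output
        self.w_phi = nn.Linear(d_in, d_out)      # residual path W_phi
        self.w_psi = nn.Linear(d_out, d_out)     # residual path W_psi

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        inner = self.phi(z).sum(dim=1)                    # (batch, d_out)
        kan_path = self.psi(inner).squeeze(-1)            # (batch, d_out)
        residual = self.w_psi(torch.relu(self.w_phi(z)))  # (batch, d_out)
        return kan_path + residual


layer = GKANLayer(d_in=10, d_out=16)
z_next = layer(torch.randn(32, 10))  # shape (32, 16)
```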

Crucially, no coordinate-dependent (non-invariant) parameters are present in the invariant branch; all learnable content is in the univariate mappings on scalar invariants. Equivariance for vector outputs is enforced by combining the input vectors $\mathbf{x}_k$ with invariant scalar weightings as above.

Permutation invariance, necessary in many graph-structured or multi-particle systems, is implemented by summing or averaging over sets of symmetric features (e.g., all unordered pairs).
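A small sketch of this pooling, assuming the invariant pair features are the inner products $\langle \mathbf{x}_i, \mathbf{x}_j \rangle$ and using a stand-in univariate map; the final check confirms that relabeling the inputs leaves the output unchanged.

```python
# Permutation-invariant pooling over invariant pair features: one shared
# univariate map is applied to every pairwise inner product and summed over
# all unordered pairs. `phi` is an arbitrary stand-in for a learnable map.
import torch


def pair_pool(X: torch.Tensor, phi) -> torch.Tensor:
    # X: (m, n) input vectors; returns a permutation-invariant scalar.
    gram = X @ X.T                                        # (m, m) inner products
    i, j = torch.triu_indices(len(X), len(X), offset=1)   # unordered pairs i < j
    return phi(gram[i, j]).sum()


X = torch.randn(6, 3)
perm = torch.randperm(6)
print(torch.allclose(pair_pool(X, torch.sin), pair_pool(X[perm], torch.sin)))  # True
```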

5. Empirical Results: Modeling Physical and Molecular Systems

GKANs have been validated against classical KANs and multilayer perceptrons (MLPs) on a range of benchmarks requiring geometric symmetry:

  • Lennard–Jones (LJ) systems: GKANs achieve higher accuracy (reported as negative log Huber loss, where higher is better) than coordinate-based KANs or MLPs, especially as the number of particles and the dimension increase. The permutation-invariant GKAN drastically reduces the parameter count (from ~1.2M to 4k) with only a modest loss of accuracy, a regime in which MLPs fail to generalize.
  • Molecular dynamics datasets (MD17, MD22): Across several small molecules and large biomolecular systems, GKANs consistently obtain state-of-the-art negative log loss, outperforming both KAN and MLP baselines. Notably, GKAN yields smoother, more stable training and generalizes to unseen data where standard architectures overfit or collapse (Alesiani et al., 23 Feb 2025).

A summary table from (Alesiani et al., 23 Feb 2025) illustrates key performance figures (negative log loss; higher is better):

Task/System       | O(n) KAN    | O(n) MLP    | Permutation + O(n) KAN | Permutation + O(n) MLP
LJ (15 pt., 3D)   | 7.28 ± 1.17 | 7.09 ± 1.10 | 7.28 ± 1.17            | 3.92 ± 0.41
MD17 (Aspirin)    | 6.44 ± 0.10 | 5.62 ± 0.01 | 5.69 ± 0.02            | 4.73 ± 0.27
MD22 (agg.)       | 7.5–9.0     | 5.5–7.0     | 5.5–7.5                | 0.0–1.5

Geometric KANs retain high accuracy even at model sizes that are orders of magnitude smaller.

6. Theoretical and Practical Significance

The geometric KAN construction constitutes a mathematically exact and computationally efficient method to enforce desired symmetry constraints in learned functions, based on best-possible superposition decompositions over invariant features.

  • Eliminates the curse of dimensionality: Owing to the exact linear scaling of the number of outer terms, no hidden exponential scaling appears in required network size for a target error, even in high dimension (Basina et al., 15 Nov 2024).
  • Enforces group symmetry by construction: No data augmentation or post-hoc symmetrization is needed; invariance/equivariance is architecturally guaranteed.
  • Parameter efficiency and stability: Empirically, GKAN achieves lower error with much fewer parameters and significantly more stable training compared to coordinate-based (non-invariant) neural architectures.
  • Applicability to physical modeling: The architecture is especially well-suited to molecular, dynamical, or particle systems where energy, force, and other observables must be invariant or equivariant under physical symmetries.

A plausible implication is that GKAN offers a compelling approach to high-fidelity surrogate modeling in computational chemistry, statistical mechanics, and geometric deep learning whenever group-invariant function modeling is essential.

7. Extensions, Open Problems, and Outlook

While GKANs capture a broad class of group-invariant and group-equivariant function spaces, open directions include:

  • Systematic extension to other groups, manifolds, and higher-order tensors.
  • Optimization of the inner and outer univariate map parameterizations (e.g., B-splines vs. neural rational function bases).
  • Integration with compositional kernel and attention mechanisms, unifying GKAN with kernelized transformer modules (Liu et al., 29 Mar 2025).
  • Investigation of learning-theoretic sample efficiency and generalization bounds for GKANs relative to standard architectures.

The geometric Kolmogorov–Arnold superposition framework, and its implementation in GKANs, unify exact functional representation theory with practical group-aware neural architecture design, enabling state-of-the-art results and theoretical guarantees in diverse symmetry-laden application domains (Alesiani et al., 23 Feb 2025, Basina et al., 15 Nov 2024).
