Kolmogorov-Arnold Graph Neural Networks

Updated 29 December 2025
  • KAGNNs are advanced graph neural network architectures that use learnable univariate non-linear maps based on the Kolmogorov–Arnold theorem to boost expressivity and interpretability.
  • They replace traditional linear weights with spline, polynomial, or Fourier-based functions, enabling universal function approximation and robust performance across tasks.
  • Empirical results show that KAGNNs outperform standard GNNs in node classification, link prediction, and specialized applications in biomedical and materials science.

Kolmogorov-Arnold Graph Neural Networks (KAGNNs) are a class of graph neural network architectures that instantiate the Kolmogorov–Arnold superposition theorem within the message-passing or convolutional paradigm of GNNs. By systematically replacing traditional linear weights and fixed activation functions with learnable univariate non-linear maps—typically parameterized as splines, polynomials, or Fourier bases—KAGNNs achieve enhanced representational capacity, theoretical universality, and explicit interpretability over standard MLP-based GNNs. These models have demonstrated superior performance and interpretability across node classification, link prediction, graph classification/regression, contrastive learning, and domain-specific biomedical and materials science applications.

1. Mathematical Foundations: Kolmogorov–Arnold Decomposition

KAGNNs are motivated by the Kolmogorov–Arnold superposition theorem, which states that any continuous multivariate function $f:[0,1]^n \to \mathbb{R}$ can be decomposed as

$$f(x_1, \ldots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\left(\sum_{p=1}^{n} \phi_{q,p}(x_p)\right)$$

where $\phi_{q,p}$ and $\Phi_q$ are continuous univariate functions. Unlike conventional neural networks that employ weight matrices and fixed activations, a Kolmogorov–Arnold Network (KAN) replaces each scalar weight by a trainable univariate function, effectively expanding the expressivity and flexibility of each connection (Kiamari et al., 10 Jun 2024, Zhang et al., 19 Jun 2024, Carlo et al., 26 Jun 2024). This approach provides a universal function approximator using only sums and compositions of univariate nonlinearities (Carlo et al., 26 Jun 2024).
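
A standard illustration of this superposition form (not specific to any of the cited papers): for positive inputs, the two-variable product is a single Kolmogorov–Arnold term with inner maps $\phi_1 = \phi_2 = \log$ and outer map $\Phi = \exp$,

$$x_1 x_2 = \exp\left(\log x_1 + \log x_2\right), \qquad x_1, x_2 > 0.$$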

2. General KAGNN Layer Designs and Algorithmic Structure

Within KAGNNs, the core layer replaces the typical linear/MLP transformation and elementwise activation with a bank of learnable univariate functions. Specifically, for an input vector $\mathbf{x} \in \mathbb{R}^d$, the KAN layer computes

$$y_j = \sum_{i=1}^{d} \phi_{j,i}(x_i)$$

where each $\phi_{j,i}: \mathbb{R} \to \mathbb{R}$ is parameterized as a spline, polynomial, or (optionally) via alternative bases such as radial basis functions or Fourier series (Kiamari et al., 10 Jun 2024, Carlo et al., 26 Jun 2024, Li et al., 15 Oct 2024, Afia et al., 17 May 2025, Bresson et al., 26 Jun 2024).
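
To make the layer computation concrete, the following is a minimal PyTorch sketch (illustrative, not taken from any of the cited implementations) of a KAN layer in which each $\phi_{j,i}$ is a SiLU residual branch plus a Gaussian RBF expansion, one of the alternative bases mentioned above; the class name, grid range, and sizes are assumptions made for the example.

```python
# Minimal, illustrative KAN layer: y_j = sum_i phi_{j,i}(x_i).
# Each edge (j, i) carries its own learnable univariate map, here a SiLU
# residual branch plus a Gaussian RBF expansion on a fixed grid (assumed basis).
import torch
import torch.nn as nn
import torch.nn.functional as F


class KANLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, num_basis: int = 8,
                 grid_range: tuple = (-2.0, 2.0)):
        super().__init__()
        centers = torch.linspace(grid_range[0], grid_range[1], num_basis)
        self.register_buffer("centers", centers)               # shared RBF grid
        self.gamma = (num_basis / (grid_range[1] - grid_range[0])) ** 2
        # coeffs[j, i, k]: weight of basis k on the edge from input i to output j
        self.coeffs = nn.Parameter(0.1 * torch.randn(out_dim, in_dim, num_basis))
        self.w_b = nn.Parameter(torch.ones(out_dim, in_dim))   # residual-branch scale
        self.w_s = nn.Parameter(torch.ones(out_dim, in_dim))   # basis-branch scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_dim) -> y: (batch, out_dim)
        basis = torch.exp(-self.gamma * (x.unsqueeze(-1) - self.centers) ** 2)  # (B, in, K)
        spline = torch.einsum("bik,oik->boi", basis, self.coeffs)               # (B, out, in)
        residual = self.w_b * F.silu(x).unsqueeze(1)                            # (B, out, in)
        return (self.w_s * spline + residual).sum(dim=-1)                       # sum over inputs i
```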

KAGNN layers are inserted into standard GNN architectures in place of MLPs within node-update or attention-scoring modules. For message-passing GNNs, both GCN- and GIN-like variants implement

$$h_v^{(\ell+1)} = \text{KAN}\left(\sum_{u \in \mathcal{N}(v)} \text{aggregation}\bigl(h_u^{(\ell)}\bigr)\right)$$

Multiple KAGNN layers can be stacked, and readout can either be a standard permutation-invariant pooling or an additional KAN layer (Bresson et al., 26 Jun 2024, Li et al., 15 Oct 2024, Wang et al., 21 May 2025).
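
As a hedged sketch of this update rule, the GIN-style variant below sum-aggregates neighbor features over a dense adjacency matrix and replaces the usual MLP update with the `KANLayer` sketched above; the dense adjacency, class names, and toy graph are simplifications made for illustration.

```python
# Illustrative GIN-style KAGNN layer: sum aggregation followed by a KAN update.
# Reuses the KANLayer sketch above; dense adjacency keeps the example short.
import torch
import torch.nn as nn


class KAGINLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))   # learnable GIN-style epsilon
        self.kan = KANLayer(in_dim, out_dim)      # KAN replaces the GIN MLP

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (num_nodes, in_dim), adj: (num_nodes, num_nodes) dense 0/1 adjacency
        aggregated = adj @ h                      # sum over neighbors
        return self.kan((1.0 + self.eps) * h + aggregated)


# Toy usage on a 3-node path graph with 4-dimensional node features (assumed sizes).
h = torch.randn(3, 4)
adj = torch.tensor([[0., 1., 0.],
                    [1., 0., 1.],
                    [0., 1., 0.]])
out = KAGINLayer(4, 8)(h, adj)                    # -> shape (3, 8)
```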

Architectural innovations include:

  • Spline-based edge and node activations: Each edge or node update employs a learnable spline function, with coefficients learned by backpropagation (Carlo et al., 26 Jun 2024).
  • KAN-augmented attention: The neighbor-scoring function in attentive GNNs (e.g., GAT) is replaced by a KAN module, resulting in Kolmogorov–Arnold Attention (KAA), which can universally approximate any ranking over the neighbors (Fang et al., 23 Jan 2025); see the sketch after this list.
  • Domain-adaptive basis selection: B-splines, RBFs, Fourier, and Jacobi polynomial bases are all used to instantiate KAN layers, depending on the smoothness/structure of the application (Li et al., 15 Oct 2024, Afia et al., 17 May 2025).
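
The sketch below illustrates the KAA idea from the list above: the linear GAT scorer is replaced by a small KAN that maps each concatenated node pair to a scalar attention logit. It reuses the `KANLayer` sketch from earlier in this section; the exact scoring form in the KAA paper may differ, and all names here are illustrative.

```python
# Illustrative KAA-style attention scoring: a KAN replaces the linear scorer
# a^T [W h_i || W h_j] of GAT. Reuses the KANLayer sketch above.
import torch
import torch.nn as nn


class KAAScorer(nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int):
        super().__init__()
        self.lin = nn.Linear(in_dim, hidden_dim, bias=False)   # shared projection W
        self.kan_score = KANLayer(2 * hidden_dim, 1)           # KAN scoring head

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (N, in_dim); adj: (N, N) dense adjacency, assumed to include self-loops
        z = self.lin(h)
        n = z.size(0)
        pairs = torch.cat([z.unsqueeze(1).expand(n, n, -1),     # target node i
                           z.unsqueeze(0).expand(n, n, -1)],    # neighbor node j
                          dim=-1)                                # (N, N, 2*hidden)
        logits = self.kan_score(pairs.reshape(n * n, -1)).view(n, n)
        logits = logits.masked_fill(adj == 0, float("-inf"))     # score only neighbors
        return torch.softmax(logits, dim=-1)                     # attention weights per row
```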

3. Learning, Parameterization, and Expressivity

Each univariate map is parameterized as a weighted sum of basis functions. For B-splines of degree $p$ on a grid of $g$ intervals,

$$\phi(x) = w_b \cdot b(x) + w_s \cdot \sum_{k=1}^{K} c_k B_k(x)$$

where $b(x)$ is a fixed residual basis (e.g., SiLU), $B_k$ are spline basis functions, and $w_b, w_s, c_k$ are trainable parameters (Kiamari et al., 10 Jun 2024, Carlo et al., 26 Jun 2024, Li et al., 15 Oct 2024). Fourier-based KANs represent each $\phi_{j,i}$ as

$$\phi_{j,i}(x) = \sum_{k=1}^{K} \left( A_{k,j,i}\cos(kx) + B_{k,j,i}\sin(kx) \right)$$

yielding explicit capacity control and interpretability (Li et al., 15 Oct 2024).
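
A minimal sketch of this Fourier parameterization follows; the class name, frequency count, and initialization are illustrative assumptions rather than the cited implementation.

```python
# Illustrative Fourier-basis KAN layer: each phi_{j,i} is a truncated Fourier
# series with trainable coefficients A_{k,j,i} and B_{k,j,i}.
import torch
import torch.nn as nn


class FourierKANLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, num_freq: int = 5):
        super().__init__()
        self.A = nn.Parameter(0.1 * torch.randn(num_freq, out_dim, in_dim))
        self.B = nn.Parameter(0.1 * torch.randn(num_freq, out_dim, in_dim))
        self.register_buffer("k", torch.arange(1, num_freq + 1, dtype=torch.float32))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_dim); kx: (batch, in_dim, K)
        kx = x.unsqueeze(-1) * self.k
        # phi_{j,i}(x_i) = sum_k A_{k,j,i} cos(k x_i) + B_{k,j,i} sin(k x_i)
        cos_term = torch.einsum("bik,koi->boi", torch.cos(kx), self.A)
        sin_term = torch.einsum("bik,koi->boi", torch.sin(kx), self.B)
        return (cos_term + sin_term).sum(dim=-1)   # y_j = sum_i phi_{j,i}(x_i)
```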

The expressive power of KAGNNs is formally connected to the maximum ranking distance (MRD) metric for neighbor scoring: a single-layer KAN with zero-order B-splines can approximate any permutation of neighbors, surpassing the expressivity of linear or shallow MLP scoring (Fang et al., 23 Jan 2025). This universality is reflected in both theoretical analysis and consistent empirical gains across tasks, especially under parameter constraints or low-label regimes (Carlo et al., 26 Jun 2024, Bresson et al., 26 Jun 2024).

4. Empirical Results and Application Benchmarks

KAGNNs consistently outperform or match state-of-the-art conventional GNNs in node, link, and graph-level tasks:

| Dataset / Task | Baseline GNN results | KAGNN variant results | Metric |
|---|---|---|---|
| Cora (node classification) | GCN 76.3%, GIN 60.0%, GAT 78.9% | KAGIN 81.2%, KAGIN 76.2% | Accuracy |
| PubMed (node classification) | 77.4%, 78.2% | KAGIN 81.0% | Accuracy |
| MUTAG (graph classification) | GIN 85.1%, 75.1% | KAGIN 85.5% | Accuracy |
| ModelNet40 (3D point clouds) | 84.5% (MLP-DG) | Jacobi-KAN 87.3% | Overall accuracy |
| CHILI-3K (materials) | 0.367/0.496*, 0.587* | KAGCN 0.995, KAEdgeCNN 0.976 | F1 (classification) |
| ADNI (Alzheimer's) | 57.4% (GCN) | GCN-KAN 62.6% | Accuracy |
| Multi-omics (cancer) | 95.5% (1D CNN), 94.6% (GCNN) | MOGKAN 96.3% | Accuracy |

* denotes task-specific GIN/EdgeCNN baselines on the atom-type task.

KAGNNs also achieve strong performance in graph regression (e.g., reduced MAE on ZINC and QM9) and contrastive self-supervised graph learning (up to +2% ROC-AUC over GraphCL on MoleculeNet) (Carlo et al., 26 Jun 2024, Li et al., 15 Oct 2024, Wang et al., 21 May 2025, Volzhin et al., 22 Dec 2025, Ding et al., 1 Apr 2025, Alharbi et al., 29 Mar 2025).

KAGNNs show particular strength in non-Euclidean and scientific domains: molecular property prediction (Li et al., 15 Oct 2024), complex multi-omics classification (Alharbi et al., 29 Mar 2025), inorganic nanomaterial discovery (Volzhin et al., 22 Dec 2025), and neuroimaging-based diagnostics (Ding et al., 1 Apr 2025).

5. Interpretability and Theoretical Insights

A key property of KAGNNs is inherent interpretability. Since all nonlinear transformations are explicit, learnable univariate functions, it is possible to:

  • Plot the learned splines or polynomial expansions to directly observe the transformation applied to input features or aggregated neighbor representations (Carlo et al., 26 Jun 2024).
  • Inspect attention and importance attributions: In KAA-based attention, the learned B-spline coefficients of each per-neighbor scoring map correspond directly to ranking importance, yielding nearly arbitrary ranking capacity and fidelity (Fang et al., 23 Jan 2025).
  • Direct feature/biomarker identification: Models such as MOGKAN enable attribution of output decisions to individual feature transformations, validated via biological pathway analysis (Alharbi et al., 29 Mar 2025).

Pruning of nearly-zero spline coefficients can further simplify models and support symbolic interpretability (Carlo et al., 26 Jun 2024).
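
A hedged sketch of this workflow, using the RBF-based `KANLayer` sketched in Section 2: evaluate one learned edge function $\phi_{j,i}$ on a one-dimensional grid for plotting, then zero out near-zero coefficients as a simple pruning step (the threshold and indices are illustrative).

```python
# Illustrative interpretability workflow for the KANLayer sketch above:
# plot one learned univariate map phi_{j,i} and prune near-zero coefficients.
import torch
import torch.nn.functional as F
import matplotlib.pyplot as plt

layer = KANLayer(in_dim=4, out_dim=8)            # a trained layer in practice
j, i = 0, 2                                      # output unit j, input feature i

with torch.no_grad():
    xs = torch.linspace(-2.0, 2.0, 200)
    rbf = torch.exp(-layer.gamma * (xs.unsqueeze(-1) - layer.centers) ** 2)
    phi = layer.w_s[j, i] * (rbf @ layer.coeffs[j, i]) + layer.w_b[j, i] * F.silu(xs)

    # Simple pruning: drop spline coefficients whose magnitude is below a threshold.
    layer.coeffs.mul_((layer.coeffs.abs() > 1e-2).float())

plt.plot(xs.numpy(), phi.numpy())
plt.xlabel("input value x_i")
plt.ylabel("learned phi_{j,i}(x_i)")
plt.show()
```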

6. Limitations, Hyperparameter Sensitivity, and Ongoing Challenges

Despite empirical and interpretability gains, KAGNNs introduce several new challenges:

  • Computational overhead: Evaluating splines (and especially high-order polynomials or large basis expansions) is substantially slower than standard matrix multiplications, with per-epoch costs 10–100× higher in some settings (Carlo et al., 26 Jun 2024, Kiamari et al., 10 Jun 2024, Afia et al., 17 May 2025).
  • Parameter efficiency vs. expressivity trade-off: Larger grid size, spline order, or basis expansions improve representational power but rapidly increase memory and learning complexity (Bresson et al., 26 Jun 2024, Volzhin et al., 22 Dec 2025).
  • Hyperparameter selection: Grid size, basis order, and basis type require careful tuning; higher-order polynomial bases are not always beneficial and may introduce overfitting or instability (Afia et al., 17 May 2025).
  • Scaling: Some architectures (e.g., KAEdgeCNN on large graphs) present RAM bottlenecks, motivating further development of lightweight or sparsely parameterized KAN kernels (Volzhin et al., 22 Dec 2025).
  • Sensitivity to input normalization: Some datasets (e.g., ENZYMES) require feature normalization for stable KAGNN training (Bresson et al., 26 Jun 2024).

Directions for future research include efficient GPU kernels for spline evaluation, adaptive basis choice, residual or attention-enhanced message passing, and large-scale deployment (Zhang et al., 19 Jun 2024, Bresson et al., 26 Jun 2024, Volzhin et al., 22 Dec 2025, Li et al., 15 Oct 2024, Wang et al., 21 May 2025).

7. Broader Impact and Application Scope

KAGNNs have found rapid adoption in emerging scientific ML domains where the function to be learned is known to be highly structured and non-Euclidean, and where interpretability is essential. They enable state-of-the-art accuracy in drug discovery, materials science, medical diagnostics, and multi-omics integration, and have established new SOTA results on large molecular and materials datasets (Li et al., 15 Oct 2024, Volzhin et al., 22 Dec 2025, Alharbi et al., 29 Mar 2025). The model’s flexibility in basis choice—B-splines, Fourier, RBFs, Jacobi polynomials—allows domain-informed inductive biases, while the explicit function-form design unlocks transparent post-training analysis.

The KAGNN paradigm unifies theoretical universality, strong empirical results, and interpretability in graph deep learning, opening new avenues for the principled design of expressive, transparent GNN architectures (Kiamari et al., 10 Jun 2024, Carlo et al., 26 Jun 2024, Zhang et al., 19 Jun 2024, Fang et al., 23 Jan 2025, Volzhin et al., 22 Dec 2025).
