Kolmogorov-Arnold Neural Operator (KANO)

Updated 4 January 2026
  • Kolmogorov-Arnold Neural Operator (KANO) is an operator-learning framework that employs the Kolmogorov-Arnold theorem to achieve universal operator approximation with built-in symbolic interpretability.
  • It replaces traditional node-based activations with learnable univariate functions, enhancing accuracy and efficiency in simulating high-dimensional PDE systems and physical phenomena.
  • KANO integrates dual-domain parametrizations and efficient training methods, offering significant improvements over MLPs and Fourier Neural Operators in diverse scientific applications.

A Kolmogorov-Arnold Neural Operator (KANO) is an operator-learning architecture that exploits the Kolmogorov-Arnold superposition theorem to construct interpretable neural surrogates for high-dimensional mappings and solution operators—especially in scientific computing, physics-informed learning, and complex data-driven modeling. KANO generalizes Kolmogorov-Arnold Networks (KAN) from function approximation to operator approximation, enabling both universal expressivity and symbolic interpretability, and surpassing traditional architectures such as multilayer perceptrons (MLPs) and Fourier Neural Operators (FNOs) in versatility and accuracy across a diverse range of application domains (Lee et al., 20 Sep 2025, Abueidda et al., 2024, Faroughi et al., 30 Jul 2025).

1. Mathematical Foundations and KAN-to-Operator Extension

The mathematical basis of KANO is the Kolmogorov-Arnold representation theorem: every continuous multivariate function $f:[0,1]^d \to \mathbb{R}$ admits a representation

$$f(x_1, \dots, x_d) = \sum_{i=1}^{2d+1} \psi_i\left( \sum_{j=1}^{d} \phi_{i,j}(x_j) \right),$$

where all $\phi_{i,j}$ and $\psi_i$ are continuous (often smooth) univariate functions. KAN implements this structure by replacing standard node-based activations with learnable, edge-wise univariate nonlinearities—parameterized by splines, radial basis functions (RBFs), piecewise polynomials, sines, or rational functions—followed by summation at each node (Abueidda et al., 2024, Warin, 2024, Li et al., 28 Dec 2025, Reinhardt et al., 2024).
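
To make the superposition concrete, the following minimal NumPy sketch evaluates $f(x)=\sum_i \psi_i(\sum_j \phi_{i,j}(x_j))$ with every univariate function parameterized, as on a KAN edge, by a linear combination of Gaussian RBF basis functions. The names, shapes, and random coefficients are purely illustrative, not any paper's reference implementation.

```python
import numpy as np

def univariate(t, coeffs, centers, width):
    """A learnable univariate function: linear combination of Gaussian RBFs."""
    basis = np.exp(-((t - centers) ** 2) / (2.0 * width ** 2))   # shape (n_basis,)
    return float(basis @ coeffs)

d, n_basis = 3, 8                       # input dimension, basis size per function
n_outer = 2 * d + 1                     # number of outer terms in the theorem
centers = np.linspace(0.0, 1.0, n_basis)
width = 1.0 / (n_basis - 1)

rng = np.random.default_rng(0)
phi_coeffs = rng.normal(size=(n_outer, d, n_basis))    # inner functions phi_{i,j}
psi_coeffs = rng.normal(size=(n_outer, n_basis))       # outer functions psi_i

def f(x):
    """Evaluate f(x) = sum_i psi_i( sum_j phi_{i,j}(x_j) ) for x in [0,1]^d."""
    return sum(
        univariate(sum(univariate(x[j], phi_coeffs[i, j], centers, width)
                       for j in range(d)),
                   psi_coeffs[i], centers, width)
        for i in range(n_outer))

print(f(np.array([0.2, 0.5, 0.9])))
```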

KANO extends this framework to operator learning, seeking mappings $\mathcal{G}:\mathcal{A}\to\mathcal{U}$ between function spaces. In canonical architectures such as DeepOKAN or PO-CKAN, KANO replaces the branch and trunk MLPs with KAN blocks, and all learnable submodules encode univariate response curves for each edge. For pseudo-differential, dual-domain KANO (see the dual spatial–spectral parametrization below), the operator symbol $p(x,\xi)$ is constructed via KAN modules, ensuring interpretable decomposition and symbolic extraction (Lee et al., 20 Sep 2025).

2. Architectural Mechanisms

2.1. KANO Layer Formulation

A typical deep KANO comprises $L$ layers, each mapping from $\mathbb{R}^{n_{l-1}}$ to $\mathbb{R}^{n_l}$ by

$$y_i = \Psi_i\left( \sum_{j=1}^{n_{l-1}} \psi_{i,j}(x_j) \right), \quad i = 1, \dots, n_l,$$

with parameterization of each $\psi_{i,j}$ as a linear combination of basis functions (splines, RBFs, sines, rational functions) (Abueidda et al., 2024, Warin, 2024, Reinhardt et al., 2024). DeepOKAN employs Gaussian RBF expansions for each edge; PO-CKAN specializes to chunkwise shared rational functions (ERU activations) for computational efficiency (Wu et al., 9 Oct 2025). SineKAN constructs edge-wise sinusoidal expansions with grid-controlled phases, while P1-KAN leverages piecewise affine (finite-element) basis functions for stabilization on irregular tasks (Warin, 2024, Reinhardt et al., 2024).
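
A minimal PyTorch sketch of one such layer, assuming Gaussian-RBF edge functions in the spirit of DeepOKAN, is given below; the outer $\Psi_i$ is taken as the identity here since it can be absorbed into the following layer, and all class and variable names are illustrative.

```python
# Minimal sketch of one KANO layer with Gaussian-RBF edge functions.
# Each edge (j -> i) carries its own univariate function
#   psi_{i,j}(t) = sum_k c_{i,j,k} * exp(-(t - mu_k)^2 / (2*sigma^2)).
import torch
import torch.nn as nn

class RBFKANLayer(nn.Module):
    def __init__(self, in_dim, out_dim, n_centers=8, low=-1.0, high=1.0):
        super().__init__()
        self.register_buffer("centers", torch.linspace(low, high, n_centers))
        self.sigma = (high - low) / (n_centers - 1)          # fixed RBF width
        # One coefficient vector per edge: shape (out_dim, in_dim, n_centers)
        self.coeffs = nn.Parameter(0.1 * torch.randn(out_dim, in_dim, n_centers))

    def forward(self, x):                                    # x: (batch, in_dim)
        # RBF basis on every input coordinate: (batch, in_dim, n_centers)
        basis = torch.exp(-((x.unsqueeze(-1) - self.centers) ** 2)
                          / (2.0 * self.sigma ** 2))
        # Edge-wise univariate responses, summed over incoming edges j.
        return torch.einsum("bjk,ijk->bi", basis, self.coeffs)

# Example: a two-layer KAN mapping R^4 -> R^2
net = nn.Sequential(RBFKANLayer(4, 16), RBFKANLayer(16, 2))
print(net(torch.randn(5, 4)).shape)      # torch.Size([5, 2])
```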

2.2. Operator Learning (Branch–Trunk Construction)

KANO-based operator learners typically use a branch–trunk fusion, following the DeepONet paradigm:

$$\mathcal{G}_\theta[v](\xi) = \sum_{k=1}^{r} \beta_k(v)\, \tau_k(\xi),$$

where $\beta_k$ and $\tau_k$ are outputs of KAN-structured branch and trunk networks, respectively; the branch acts on sampled values of the input function and the trunk on spatial/temporal query points (Faroughi et al., 30 Jul 2025, Abueidda et al., 2024, Wu et al., 9 Oct 2025). Fusion is via summation (dot product), yielding the operator's pointwise prediction.
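
Reusing the RBFKANLayer sketch above as the building block, a hedged illustration of this branch–trunk fusion could look as follows; dimensions and names are assumptions, not the DeepOKAN or PO-CKAN reference code.

```python
# Branch net: m sensor samples of v -> r coefficients beta_k(v).
# Trunk net: query point xi -> r basis values tau_k(xi).
# Operator output: their dot product, evaluated pointwise at each query.
import torch
import torch.nn as nn

class KANBranchTrunk(nn.Module):
    def __init__(self, n_sensors, coord_dim, r=32):
        super().__init__()
        self.branch = nn.Sequential(RBFKANLayer(n_sensors, 64), RBFKANLayer(64, r))
        self.trunk = nn.Sequential(RBFKANLayer(coord_dim, 64), RBFKANLayer(64, r))

    def forward(self, v_samples, xi):
        # v_samples: (batch, n_sensors), xi: (batch, n_query, coord_dim)
        beta = self.branch(v_samples)                              # (batch, r)
        b, q, d = xi.shape
        tau = self.trunk(xi.reshape(b * q, d)).reshape(b, q, -1)   # (batch, n_query, r)
        return torch.einsum("br,bqr->bq", beta, tau)               # G[v](xi) pointwise

model = KANBranchTrunk(n_sensors=100, coord_dim=1)
out = model(torch.randn(8, 100), torch.rand(8, 50, 1))             # shape (8, 50)
```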

2.3. Dual-Domain Pseudo-Differential Parametrization

Recent advances in KANO introduce joint spectral and spatial bases via pseudo-differential quantization:

$$L_{\mathrm{KANO}}[a](x) = \Phi\left( \sum_{\xi \in \Xi} \sum_{y \in \mathcal{Y}} e^{i(x-y)\cdot \xi}\, p(x, \xi)\, a(y),\ a(x) \right),$$

where $p(x,\xi)$ is a trainable symbol encoded as a KAN expansion, allowing direct control over both position and frequency transfer characteristics (Lee et al., 20 Sep 2025). This construction bypasses the spectral bottleneck of FNO by representing operators that are sparse or localized in either basis, facilitating closed-form symbolic recovery and interpretable decompositions.
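
A minimal NumPy sketch of this quantization on a 1D periodic grid is shown below; in practice the symbol $p(x,\xi)$ would itself be a trainable KAN expansion, whereas here it is a given array, and $\Phi$ is simplified to a pointwise nonlinearity applied to the sum of the quantized term and the skip input. This is purely illustrative, not the reference implementation of Lee et al. (2025).

```python
import numpy as np

def pseudo_diff_layer(a, p, phi=np.tanh):
    """a: (n,) samples of the input function on a uniform periodic grid.
       p: (n, n) symbol values p(x_m, xi_k) over grid positions x_m and
          integer frequencies xi_k. Returns (n,) layer output."""
    n = a.shape[0]
    a_hat = np.fft.fft(a)                              # sum_y e^{-i y xi} a(y)
    x = 2.0 * np.pi * np.arange(n) / n                 # grid positions
    xi = np.fft.fftfreq(n, d=1.0 / n)                  # integer frequencies
    # Position-dependent inverse quantization:
    # for each x_m, sum_k e^{i x_m xi_k} p(x_m, xi_k) a_hat(xi_k) / n
    kernel = np.exp(1j * np.outer(x, xi))              # (n, n)
    quantized = np.real((kernel * p) @ a_hat) / n
    return phi(quantized + a)                          # Phi(quantized, a) taken as a sum

n = 64
a = np.sin(2.0 * np.pi * np.arange(n) / n)
p = np.ones((n, n))                                    # constant symbol: identity-like
out = pseudo_diff_layer(a, p)                          # here reduces to ~tanh(2*a)
```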

3. Theoretical Expressivity and Generalization

3.1. Universal Approximation

KAN architectures inherit the universal function approximation properties guaranteed by the superposition theorem, and with appropriate edge-wise parametrizations, KANO extends these guarantees to operator learning: any continuous operator with sufficiently regular symbol can be approximated to arbitrary accuracy by a finite KANO composition (Warin, 2024, Lee et al., 20 Sep 2025). Polynomial rates in the approximation error are retained for Sobolev-regular symbols.

3.2. Generalization Bounds

Rigorous generalization theory for KANO shows that excess risk scales with the product of operator-norm bounds, $\ell_1$ norms of coefficient matrices, and univariate Lipschitz constants, rather than with explicit layer widths or basis counts (up to logarithmic factors) (Zhang et al., 2024, Li et al., 28 Dec 2025). In the RKHS setting, generalization error bounds depend polynomially on the ranks and radii of the underlying Hilbert spaces. Empirical scaling aligns closely with these theoretical complexity indicators, suggesting that regularizing these quantities (e.g., coefficient norms, Lipschitz penalties, rank constraints) directly controls generalization (Zhang et al., 2024).
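
As a hedged illustration, the $\ell_1$ complexity term could be imposed during training roughly as follows, assuming a model built from the RBFKANLayer modules sketched earlier; the regularization weight `lambda_reg` is hypothetical.

```python
import torch

def kan_l1_penalty(model):
    """Sum of l1 norms of all edge-coefficient tensors in the model."""
    return sum(p.abs().sum() for name, p in model.named_parameters()
               if name.endswith("coeffs"))

# Inside a training step (lambda_reg is an illustrative hyperparameter):
# loss = task_loss + lambda_reg * kan_l1_penalty(model)
```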

3.3. Spectral–Spatial Bottlenecks

Unlike the FNO, which suffers an exponential parameter blowup when representing position-dependent operators due to algebraic Fourier tails, KANO’s dual-domain structure admits polynomial scaling in model complexity for a broad operator class (Lee et al., 20 Sep 2025). Symbolic interpretability and univariate function visualization are direct outcomes of edge-based parametrizations, promoting transparency in learned PDE surrogates and latent physical models.

4. Training Methodologies and Optimization

KANOs are trained via stochastic gradient descent (Adam, AdamW), often augmented by L-BFGS or Broyden methods for second-order optimization (Faroughi et al., 30 Jul 2025). Loss functions are task-specific: root-mean-square deviation for data-driven surrogates, composite residual plus boundary/initial condition losses for physics-informed scenarios, and KL-divergence for quantum operator inference (Abueidda et al., 2024, Wu et al., 9 Oct 2025, Li et al., 28 Dec 2025, Lee et al., 20 Sep 2025). Early stopping, mesh-wise dropout, and $\ell_1$-type regularization are deployed to manage complexity and prevent overfitting (Zhang et al., 2024). Hyperparameter selection includes choice of edge basis, number of grid points/knot locations, chunk granularity (for chunkwise KANs), and fusion dimensionality (Wu et al., 9 Oct 2025, Reinhardt et al., 2024).
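
A schematic training loop along these lines might look as follows, assuming the KANBranchTrunk sketch above and a purely data-driven RMSE-style loss; physics-informed variants would add PDE-residual and boundary/initial-condition terms to `loss_fn`, and the `data.full_batch()` accessor is hypothetical.

```python
import torch

def loss_fn(model, v, xi, u):
    # Mean-squared error between predicted and target operator outputs.
    return torch.mean((model(v, xi) - u) ** 2)

def train(model, data, epochs=500, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for v, xi, u in data:                 # batches of (inputs, queries, targets)
            opt.zero_grad()
            loss = loss_fn(model, v, xi, u)
            loss.backward()
            opt.step()
    # Optional second-order refinement stage on the full dataset.
    lbfgs = torch.optim.LBFGS(model.parameters(), max_iter=100)
    v, xi, u = data.full_batch()              # hypothetical accessor
    def closure():
        lbfgs.zero_grad()
        loss = loss_fn(model, v, xi, u)
        loss.backward()
        return loss
    lbfgs.step(closure)
```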

5. Computational Efficiency and Scaling

Chunkwise rational KANs (e.g., PO-CKAN) shrink quadratic parameter scaling to linear by sharing base activations within chunks and employing efficient ERU activations; this enables competitive expressivity at a fraction of the computational cost (Wu et al., 9 Oct 2025). SineKAN achieves further speedups—up to $9\times$ relative to B-spline KANs—by using phase-grid sinusoidal activations and vectorized computation (Reinhardt et al., 2024). Meshless RBF parametrizations, as used in DeepOKAN, circumvent grid dependencies and the expensive FFT projections required by FNO, facilitating applications to irregular domains (Abueidda et al., 2024). Inference overheads are sublinear in layer width and grid resolution when chunking and vectorization are deployed (Wu et al., 9 Oct 2025, Reinhardt et al., 2024).
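
The chunkwise-sharing idea can be sketched as below: inputs are grouped into chunks that share one base univariate function each, combined through per-edge scalar weights, so the univariate-function parameters no longer grow with both layer widths. This is a simplified illustration of the sharing principle with RBF bases, not the PO-CKAN construction itself (which uses rational ERU activations).

```python
import torch
import torch.nn as nn

class ChunkwiseKANLayer(nn.Module):
    def __init__(self, in_dim, out_dim, n_chunks=4, n_centers=8):
        super().__init__()
        assert in_dim % n_chunks == 0
        self.chunk = in_dim // n_chunks
        self.register_buffer("centers", torch.linspace(-1.0, 1.0, n_centers))
        self.sigma = 2.0 / (n_centers - 1)
        # One shared base function per chunk: (n_chunks, n_centers) coefficients,
        # instead of (out_dim, in_dim, n_centers) for a fully edge-wise layer.
        self.base_coeffs = nn.Parameter(0.1 * torch.randn(n_chunks, n_centers))
        self.mix = nn.Parameter(0.1 * torch.randn(out_dim, in_dim))  # per-edge scalars

    def forward(self, x):                                  # x: (batch, in_dim)
        basis = torch.exp(-((x.unsqueeze(-1) - self.centers) ** 2)
                          / (2.0 * self.sigma ** 2))       # (batch, in_dim, n_centers)
        coeffs = self.base_coeffs.repeat_interleave(self.chunk, dim=0)  # (in_dim, n_centers)
        activ = torch.einsum("bjk,jk->bj", basis, coeffs)  # shared phi within each chunk
        return activ @ self.mix.t()                        # (batch, out_dim)

layer = ChunkwiseKANLayer(in_dim=16, out_dim=8)
print(layer(torch.randn(4, 16)).shape)                     # torch.Size([4, 8])
```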

6. Empirical Performance and Application Domains

6.1. PDE Surrogates and Mechanics

Empirical studies on 1D sinusoidal wave operators, 2D elasticity, and transient Poisson problems demonstrate that DeepOKAN (RBF-based KANO) converges faster and to lower loss than DeepONet, with order-of-magnitude improvements in $L^2$ test error (Abueidda et al., 2024, Faroughi et al., 30 Jul 2025). PO-CKAN yields 48–80% lower mean relative $L^2$ errors on nonlinear PDEs compared to PI-DeepONet, with sharper predictions of shocks and gradient features in Burgers', Eikonal, and fractional diffusion–reaction systems (Wu et al., 9 Oct 2025).

6.2. Spectral Data and Super-Resolution

KANO-based models achieve state-of-the-art results in image super-resolution via B-spline parametrizations, capturing degradation kernels and latent spectral features far more accurately than MLPs (Li et al., 28 Dec 2025). Explicit interpretability—via visualization of learned spline parameters—affords insight into the physical origins of noise, blur, and spectral inflection points in remote sensing, hyperspectral, and natural images.

6.3. Irregular and Fractal Function Approximation

P1-KAN delivers robust convergence and accuracy, outperforming both spline-based KANs and MLPs on highly irregular/fractal function families, and is superior in high-dimensional hydraulic valley optimization tasks (Warin, 2024). The stabilized piecewise-linear basis mitigates the divergences encountered when training ReLU-based KANs.

6.4. Quantum Hamiltonian Learning

Dual-domain symbolic KANO reconstructs quantum Hamiltonians from measurement data, recovering symbolic formulae accurate to the fourth decimal place, and significantly outperforms FNO in long-horizon quantum state fidelity tests (Lee et al., 20 Sep 2025). Symbolic recovery enables inspection of learned physical operators and their terms.

7. Limitations, Challenges, and Future Directions

Operational deployment of KANO architectures in large-scale settings faces hurdles regarding computational demands (especially for edge-wise parametrizations in high-dimensional trunks), sensitivity to mesh/basis choices, and scalability in non-periodic or complex geometric domains (Faroughi et al., 30 Jul 2025, Wu et al., 9 Oct 2025, Warin, 2024, Lee et al., 20 Sep 2025). Automated hyperparameter optimization, adaptive basis selection, and integration with non-recursive architectures remain important active development lines.

Rigorous operator-level error bounds—especially in the finite-depth, finite-width regime—are an open theoretical challenge. The extension of symbolic KANO architectures to arbitrarily irregular geometries and to operator inference in non-Euclidean spaces is an ongoing research area. Future work is anticipated to improve the robustness, scalability, and physical consistency of KANO-based frameworks (Faroughi et al., 30 Jul 2025, Warin, 2024, Zhang et al., 2024, Lee et al., 20 Sep 2025).
