Physics-Informed Kolmogorov–Arnold Networks

Updated 17 December 2025
  • PIKANs are physics-informed machine learning architectures that utilize the Kolmogorov–Arnold decomposition to achieve universal approximation and enhanced local interpretability in modeling PDEs.
  • They integrate spline-, polynomial-, and wavelet-based edge functions, enabling robust handling of multiscale, high-frequency, and data-sparse challenges.
  • Advanced optimization methods, including adaptive and second-order strategies, significantly boost PIKANs’ efficiency, scalability, and noise robustness in complex scientific applications.

Physics-Informed Kolmogorov–Arnold Networks (PIKANs) are a specialized class of scientific machine learning architectures that replace the universal multilayer perceptron (MLP) ansatz of Physics-Informed Neural Networks (PINNs) with the compositional structure afforded by Kolmogorov–Arnold Networks (KANs). PIKANs leverage the Kolmogorov–Arnold representation theorem to construct deep networks whose compositional polynomial or spline-based edge functions yield both universal approximation capabilities and enhanced local interpretability in the learning of solutions to partial differential equations (PDEs), inverse problems, and scientific regression tasks (Pérez-Bernal et al., 12 Dec 2025, Toscano et al., 17 Oct 2024). They have demonstrated superior performance or complementary advantages over PINNs in multiscale, high-frequency, data-sparse, and high-dimensional scientific modeling. This article reviews their mathematical formulation, architectural innovations, training/optimization approaches, comparative performance, advanced variants, and application landscape.

1. Mathematical Foundations: Kolmogorov–Arnold Representation

At the core of PIKANs lies the Kolmogorov–Arnold representation theorem, which states that any continuous multivariate function on a compact domain can be expressed as a finite sum of univariate outer functions composed with sums of univariate inner functions:

$$f(x_1,\dots,x_d) \;=\; \sum_{q=1}^{2d+1} G_q\!\Bigl(\sum_{p=1}^{d} \Phi_{q,p}(x_p)\Bigr)$$

where each $G_q$ and $\Phi_{q,p}$ is a continuous function of a single variable. This decomposition yields a provably universal function representation. In the neural context, each edge in a given layer of the KAN is parameterized by a learnable function, typically implemented via splines, Chebyshev polynomials, or other basis expansions (Pérez-Bernal et al., 12 Dec 2025, Toscano et al., 17 Oct 2024).

The network output is constructed as a composition of spline-based transformations:

$$\text{KAN}(x) \;=\; (\Phi_{L-1}\circ\cdots\circ\Phi_0)(x)$$

where each layer $i$ operates as

$$[\Phi_i(x)]_j \;=\; \sum_{p=1}^{n_i} \sum_{m=0}^{M} c^{(i,j,p)}_m\,B_{m,k}(x_p)$$

with $B_{m,k}$ the B-spline basis functions and $c^{(i,j,p)}_m$ trainable coefficients.
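
To make the construction concrete, the following minimal PyTorch sketch implements a single spline-based KAN layer: a Cox–de Boor evaluation of the B-spline basis followed by the edge-wise linear combination in the layer equation above. The grid bounds, spline degree, initialization scale, and the omission of the usual base-activation (residual) branch are illustrative assumptions, not the settings of any particular published PIKAN.

```python
import torch

def bspline_basis(x, grid, k):
    """Evaluate all B-spline basis functions B_{m,k} at points x via the
    Cox-de Boor recursion. x: (N,), grid: non-decreasing knot vector (G,),
    k: spline degree. Returns (N, G - k - 1). Assumes a uniform grid
    (no repeated knots), so the divisions below are well defined."""
    x = x.unsqueeze(-1)                                    # (N, 1)
    B = ((x >= grid[:-1]) & (x < grid[1:])).to(x.dtype)    # degree-0 indicators
    for d in range(1, k + 1):
        left = (x - grid[:-(d + 1)]) / (grid[d:-1] - grid[:-(d + 1)]) * B[:, :-1]
        right = (grid[d + 1:] - x) / (grid[d + 1:] - grid[1:-d]) * B[:, 1:]
        B = left + right
    return B

class KANLayer(torch.nn.Module):
    """One KAN layer: each edge (input p -> output j) carries its own learnable
    spline, [Phi(x)]_j = sum_p sum_m c_m^{(j,p)} B_{m,k}(x_p)."""
    def __init__(self, n_in, n_out, grid_size=8, k=3, x_min=-1.0, x_max=1.0):
        super().__init__()
        h = (x_max - x_min) / grid_size
        # Uniform knot vector, padded by k knots on each side of [x_min, x_max].
        grid = torch.arange(-k, grid_size + k + 1, dtype=torch.float32) * h + x_min
        self.register_buffer("grid", grid)
        self.k = k
        n_basis = grid.numel() - k - 1
        self.coef = torch.nn.Parameter(0.1 * torch.randn(n_out, n_in, n_basis))

    def forward(self, x):                                  # x: (N, n_in)
        B = torch.stack([bspline_basis(x[:, p], self.grid, self.k)
                         for p in range(x.shape[1])], dim=1)  # (N, n_in, n_basis)
        return torch.einsum("npm,jpm->nj", B, self.coef)      # (N, n_out)

# Example: a 2-input, 1-output layer evaluated at random points in [-1, 1]^2.
layer = KANLayer(n_in=2, n_out=1)
u = layer(2 * torch.rand(64, 2) - 1)                       # (64, 1)
```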

By replacing the standard PINN's MLP with this structure, PIKANs enable more parsimonious and interpretable function approximations and facilitate local adaptation to problem geometry and boundary/interior features (Gong et al., 23 Aug 2025).

2. PIKAN Architectures: Variants, Edge Functions, and Hybridization

PIKANs support a range of architectural variants, driven largely by the choice of univariate function parameterization, network depth/width configurations, and hybridization schemes:

  • Spline-based PIKANs: Employ B-splines on each edge, yielding piecewise-polynomial activation functions. These networks are especially suited for problems with discontinuities or material heterogeneity (Pérez-Bernal et al., 12 Dec 2025, Gong et al., 23 Aug 2025).
  • Polynomial-based (Chebyshev, Jacobi, Fourier) PIKANs: Use Chebyshev or Jacobi polynomials for efficient representation of smooth and oscillatory features. The Chebyshev-based cPIKAN and Jacobi-PIKAN constructions are prevalent for spectral accuracy and numerical stability (Faroughi et al., 9 Jun 2025, Kashefi et al., 8 Apr 2025); a Chebyshev edge-function sketch follows this list.
  • Wavelet-based and Multi-resolution PIKANs: Architectures such as HWF-PIKAN incorporate multi-resolution embeddings via wavelets and Fourier bases before the KAN core, enhancing the network’s ability to resolve both smooth and sharp features, and mitigating spectral bias (Heravifard et al., 12 Dec 2025, Patra et al., 25 Jul 2024).
  • Hybrid and Parallel Structures: Parallel or domain-decomposed models combine MLP and KAN branches, e.g., Hybrid Parallel Kolmogorov-Arnold/MLP PINNs (HPKM-PINN), which use a trainable convex mixing parameter to optimally blend features of both models across subdomains (Huang et al., 14 Nov 2025, Xu et al., 30 Mar 2025).
  • Separable PIKANs (SPIKANs): Decompose high-dimensional problems by a separation-of-variables ansatz, representing target functions as low-rank sums of products of univariate KANs, greatly alleviating the curse of dimensionality (Jacob et al., 9 Nov 2024).
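
As an example of the polynomial-based variants, edge functions can instead be expanded in Chebyshev polynomials evaluated by the three-term recurrence. In the sketch below, the tanh squashing of inputs to [-1, 1] and the initialization scale are assumptions made for illustration, not the exact published cPIKAN recipe.

```python
import torch

class ChebyKANLayer(torch.nn.Module):
    """KAN layer with Chebyshev-polynomial edge functions (cPIKAN-style):
    [Phi(x)]_j = sum_p sum_m c_m^{(j,p)} T_m(tanh(x_p))."""
    def __init__(self, n_in, n_out, degree=5):
        super().__init__()
        self.degree = degree
        self.coef = torch.nn.Parameter(
            torch.randn(n_out, n_in, degree + 1) / (n_in * (degree + 1)) ** 0.5)

    def forward(self, x):                        # x: (N, n_in)
        z = torch.tanh(x)                        # map inputs into [-1, 1]
        T = [torch.ones_like(z), z]              # T_0, T_1
        for _ in range(2, self.degree + 1):
            T.append(2 * z * T[-1] - T[-2])      # T_m = 2 z T_{m-1} - T_{m-2}
        T = torch.stack(T[: self.degree + 1], dim=-1)   # (N, n_in, degree+1)
        return torch.einsum("npm,jpm->nj", T, self.coef)
```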

3. Physics-Informed Learning: Loss Functions, Sampling, and Implicit Boundary Handling

PIKANs are trained by minimizing composite loss functions comprising physics residuals, boundary/initial condition discrepancies, and (optionally) data misfit terms:

$$\mathcal{L} \;=\; \lambda_{\rm PDE}\,\mathcal{L}_{\rm PDE} + \lambda_u\,\mathcal{L}_u + \lambda_k\,\mathcal{L}_k + \lambda_{\rm bnd}\,\mathcal{L}_{u_{\rm bnd}}$$

where each term quantifies squared error in the differential operator, observed variables, parameters, or boundary values (Pérez-Bernal et al., 12 Dec 2025).
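
A minimal sketch of assembling such a composite loss with automatic differentiation follows, using a manufactured 1D Poisson problem u''(x) = f(x) purely as a placeholder; the weight values, the specific boundary term, and the omitted data/parameter terms ($\lambda_u$, $\lambda_k$) are illustrative assumptions.

```python
import math
import torch

def pde_residual(model, x):
    """PDE residual for a 1D Poisson problem u''(x) = f(x); derivatives of the
    network output are obtained by automatic differentiation."""
    x = x.clone().requires_grad_(True)
    u = model(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    f = -(math.pi ** 2) * torch.sin(math.pi * x)      # manufactured source term
    return d2u - f

def composite_loss(model, x_int, x_bnd, u_bnd, lam_pde=1.0, lam_bnd=10.0):
    """L = lam_pde * ||PDE residual||^2 + lam_bnd * ||u - u_bnd||^2 on the boundary.
    Data-misfit and parameter terms would be added analogously for inverse problems."""
    loss_pde = pde_residual(model, x_int).pow(2).mean()
    loss_bnd = (model(x_bnd) - u_bnd).pow(2).mean()
    return lam_pde * loss_pde + lam_bnd * loss_bnd
```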

In unbounded or semi-infinite domains, PIKANs employ carefully designed sampling strategies:

  • Infinite domains: Collocation points drawn from centered normal distributions $\mathcal{N}(0, \sigma^2 I)$ concentrate near the region of interest and induce solution stabilization far from the origin.
  • Semi-infinite domains: Combine normal sampling in $x$ with exponential sampling in $y$ to cluster training points near critical boundaries (Pérez-Bernal et al., 12 Dec 2025); a minimal sampling sketch follows this list.
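
A sketch of these two sampling schemes (the scale and rate parameters are illustrative assumptions):

```python
import torch

def sample_infinite(n, dim=2, sigma=1.0):
    """Unbounded domain: draw collocation points from N(0, sigma^2 I),
    concentrating them near the region of interest."""
    return sigma * torch.randn(n, dim)

def sample_semi_infinite(n, sigma=1.0, rate=1.0):
    """Semi-infinite domain: normal in x, exponential in y >= 0,
    clustering points near the y = 0 boundary."""
    x = sigma * torch.randn(n, 1)
    y = torch.distributions.Exponential(rate).sample((n, 1))
    return torch.cat([x, y], dim=1)
```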

Some PIKAN deployments enforce boundary behavior implicitly by leveraging stabilization (vanishing gradients/residuals) at infinity, rather than explicit boundary constraints.

For multi-material problems, B-spline-based PIKANs can capture solution kinks and gradient jumps at material interfaces without explicit subdomain partitioning or interface penalties (Gong et al., 23 Aug 2025).

4. Training, Optimization, and Scalability

PIKANs adopt modern optimization schemes, with combinations of adaptive first-order methods (Adam) and second-order, curvature-aware quasi-Newton solvers (L-BFGS, SSBFGS, SSBroyden). Second-order strategies, especially Self-Scaled Broyden, yield orders-of-magnitude improvements in accuracy and robustness over classical BFGS/L-BFGS in stiff and multi-modal PDE landscapes, and are crucial to unlocking the expressive power of spline/polynomial expansions (Kiyani et al., 22 Jan 2025, Huang et al., 14 Nov 2025).
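
In practice this often takes the form of an Adam warm-up followed by a quasi-Newton refinement. The sketch below reuses the composite_loss sketch from Section 3 and substitutes PyTorch's built-in L-BFGS for the Self-Scaled Broyden variants discussed in the cited work; the step counts and learning rate are illustrative assumptions.

```python
import torch

def train(model, x_int, x_bnd, u_bnd, adam_steps=2000, lbfgs_iters=200):
    """Two-stage optimization: Adam for robust initial exploration, then an
    L-BFGS refinement (stand-in for SSBFGS/SSBroyden) on the same composite loss."""
    adam = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(adam_steps):
        adam.zero_grad()
        loss = composite_loss(model, x_int, x_bnd, u_bnd)
        loss.backward()
        adam.step()

    lbfgs = torch.optim.LBFGS(model.parameters(), max_iter=lbfgs_iters,
                              line_search_fn="strong_wolfe")
    def closure():
        lbfgs.zero_grad()
        loss = composite_loss(model, x_int, x_bnd, u_bnd)
        loss.backward()
        return loss
    lbfgs.step(closure)
    return model
```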

Trainable convex combination weights ($\alpha$), overlapping domain decomposition, and adaptive loss-weighting (residual- or gradient-based attention) further accelerate convergence and improve performance in multiscale and high-frequency problems (Huang et al., 14 Nov 2025, Zhang et al., 13 May 2025). JAX-based PIKAN implementations exploiting just-in-time compilation and vectorization have achieved up to 84× training speedup compared to original NumPy/PyTorch KANs (Rigas et al., 24 Jul 2024).
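
A minimal sketch of the trainable convex mixing idea, in the spirit of HPKM-PINN rather than a reproduction of its published architecture, is:

```python
import torch

class HybridParallelModel(torch.nn.Module):
    """Parallel KAN and MLP branches blended by a single trainable convex weight
    alpha in (0, 1), obtained from a raw parameter through a sigmoid."""
    def __init__(self, kan_branch, mlp_branch):
        super().__init__()
        self.kan = kan_branch
        self.mlp = mlp_branch
        self.raw_alpha = torch.nn.Parameter(torch.zeros(1))

    def forward(self, x):
        alpha = torch.sigmoid(self.raw_alpha)
        return alpha * self.kan(x) + (1.0 - alpha) * self.mlp(x)
```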

Deep PIKANs, particularly cPIKANs, face challenges with gradient vanishing or explosion, motivating basis-agnostic Glorot-like initialization and the development of architectures with residual/skip-gating (e.g., Residual-Gated Adaptive KANs) to traverse all learning phases and maintain stability with increasing network depth (Rigas et al., 27 Oct 2025). Multi-resolution and alternating training protocols further boost efficiency on multi-scale PDEs (Yang et al., 26 Jul 2025).

5. Benchmarking, Applications, and Comparative Performance

A wide range of benchmarks confirms the capacity of PIKANs to outperform standard PINNs and MLP-based architectures, especially in multiscale, high-frequency, data-sparse, and high-dimensional regimes.

PIKANs typically achieve lower normalized $L^2$ errors with fewer parameters when compared to MLP or PINN models, exhibit improved robustness to noise, and, when hybridized with domain decomposition or MLP branches, reach the lowest observed test errors and fastest convergence in ablation studies (Huang et al., 14 Nov 2025, Heravifard et al., 12 Dec 2025).

Some quantifiable advantages and trade-offs (Pérez-Bernal et al., 12 Dec 2025, Gong et al., 23 Aug 2025):

| Task & Setting | PINN Rel. Error | PIKAN Rel. Error | Efficiency / Speedup |
| --- | --- | --- | --- |
| Inverse Poisson, unbounded domain | 0.097% | 1.086% | PINN ~1000× faster |
| 2D Poisson (multi-scale; HPKM-PINN) | 3.47e-3 (MLP) | 6.07e-4 (KAN) | PIKAN/Hybrid fastest |
| Multi-material elasticity (single net) | N/A | L2(u_x) ~0.7% | No domain decomposition needed |
| Power system parameter ID (median) | >20% (PINN) | ~1% (PIKAN) | 40–60% fewer parameters |

Noise robustness, parameter/architecture efficiency, and ability to generalize across irregular geometries or numerous domains have been empirically validated (Kashefi et al., 8 Apr 2025, Pérez-Bernal et al., 12 Dec 2025).

6. Advanced Extensions: Multi-Resolution, Attention, and Separable Formulations

PIKANs have been extended to multi-resolution spectral hybridizations (HWF-PIKAN), combining wavelet and Fourier features to explicitly counteract spectral bias and accelerate convergence for advection-dominated and kinetic equations (Heravifard et al., 12 Dec 2025). Attention-enhanced and Chebyshev polynomial-based models (AC-PKAN) integrate internal feature-wise attention and residual-gradient attention, delivering state-of-the-art results in zero-data or weakly supervised PDEs while ensuring full Jacobian rank and non-vanishing derivative properties (Zhang et al., 13 May 2025).

For high-dimensional problems, separable physics-informed KANs (SPIKANs) decompose the solution into products of one-dimensional networks, drastically reducing complexity and eliminating the curse of dimensionality, while retaining accuracy and interpretability (Jacob et al., 9 Nov 2024).
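
A sketch of the separable ansatz for a 2D problem, reusing the KANLayer sketch from Section 1, is shown below; the rank and the simple outer-product evaluation are illustrative assumptions rather than the reference SPIKAN implementation.

```python
import torch

class SeparableKAN2D(torch.nn.Module):
    """Separable representation u(x, y) ~= sum_{r=1}^{R} f_r(x) g_r(y), with each
    factor set {f_r}, {g_r} realized by a univariate KAN layer. Evaluating each
    factor on its own 1D collocation grid and combining by an outer product avoids
    sampling the full tensor-product grid."""
    def __init__(self, rank=8, **kan_kwargs):
        super().__init__()
        self.fx = KANLayer(1, rank, **kan_kwargs)   # R univariate functions of x
        self.gy = KANLayer(1, rank, **kan_kwargs)   # R univariate functions of y

    def forward(self, x, y):            # x: (Nx, 1), y: (Ny, 1) 1D collocation grids
        Fx = self.fx(x)                 # (Nx, R)
        Gy = self.gy(y)                 # (Ny, R)
        return Fx @ Gy.T                # (Nx, Ny) low-rank solution field
```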

Domain scaling, residual-based collocation re-sampling, adaptive loss attention, and training set adaptation further enhance the practical utility and efficiency of the framework, with implementation validated on JAX for extreme parameter efficiency (Rigas et al., 24 Jul 2024, Mostajeran et al., 6 Jan 2025).

7. Limitations, Open Problems, and Future Directions

PIKANs’ expressive power and interpretability are balanced by computational and modeling challenges:

  • Per-iteration cost is higher than for MLP-based PINNs, mainly due to the complexity of edge-wise basis function evaluation and backpropagation;
  • Deep PIKANs without correct initialization or skip-gating face training instability and gradient vanishing/exploding;
  • For very high-dimensional applications, compositional scaling (number of inner/outer functions) and collocation point selection can become intractable without recourse to separable representations or adaptive sampling (Toscano et al., 17 Oct 2024, Jacob et al., 9 Nov 2024);
  • Fine-tuning knot density, basis order, and network width/depth remains architecture-dependent and highly problem-specific;
  • Symbolic extraction for physical interpretability, while feasible, may not always yield human-readable formulae faithful to true generative structure (Shuai et al., 13 Aug 2024);
  • Integration of full DAE models, online/continual learning, and extension to unsteady or stochastic PDEs are ongoing research areas (Shuai et al., 13 Aug 2024, Heravifard et al., 12 Dec 2025).

Future directions include theoretical development of infinite-width/NTK limits for nested cPIKAN structures, adaptive operator learning via SPIKANs, integration with partial symbolic physics, and deployment in scientific digital twins.


PIKANs thus represent a theoretically principled and practically validated evolution in physics-informed machine learning, marrying the universality and explicit structure of the Kolmogorov–Arnold decomposition with sophisticated physics-constrained loss formulations and advanced training methodologies (Pérez-Bernal et al., 12 Dec 2025, Toscano et al., 17 Oct 2024, Huang et al., 14 Nov 2025, Gong et al., 23 Aug 2025, Zhang et al., 13 May 2025).
