
Symbolic Regression with KANs

Updated 14 February 2026
  • Symbolic regression via KANs is based on decomposing any continuous function into a sum of learnable univariate functions as per the Kolmogorov-Arnold theorem.
  • KAN architectures use trainable spline-based activations and structured summation layers to extract closed-form expressions that capture physical laws in systems like ODEs and constitutive models.
  • Recent advances incorporate sparsity regularization and LLM-guided symbolic extraction to enhance model accuracy, interpretability, and parsimony in scientific applications.

Kolmogorov-Arnold Networks (KANs) provide a principled framework for symbolic regression by operationalizing the Kolmogorov-Arnold representation theorem within a trainable neural network architecture. Symbolic regression via KANs leverages their explicit sum-of-univariates structure, allowing learned functions to be directly mapped to closed-form expressions, thereby bridging modern deep learning with interpretable, analytic modeling. This approach extends to a variety of scientific and engineering domains, enabling the discovery of governing ODEs, PDEs, constitutive laws, and other physical relations while maintaining parsimony and interpretability.

1. Mathematical and Theoretical Foundations

KANs directly instantiate the Kolmogorov-Arnold representation theorem, which asserts that any continuous function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ can be decomposed as

$$f(x_1, \ldots, x_d) = \sum_{q=1}^{2d+1} \Phi_q\left(\sum_{p=1}^{d} \psi_{q,p}(x_p)\right),$$

where $\psi_{q,p}$ and $\Phi_q$ are univariate functions. In the context of KANs, this translates to an architecture where each "edge" between layers corresponds to a learnable univariate function, often parameterized as splines or basis expansions (Koenig et al., 2024, Thakolkaran et al., 7 Mar 2025, Dorji et al., 27 Aug 2025).

KANs typically employ matrix layers of such univariate activations, followed by summations along one or more axes, enabling both additive and (in later extensions) multiplicative composition of features (Liu et al., 2024). This structure naturally aligns with the modular and often sparse form of physical laws.
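
As a concrete illustration, the sum-of-univariates structure can be checked numerically for a function that is, by construction, a single outer branch applied to a sum of inner univariate functions. The choices of $\psi$ and $\Phi$ below are hypothetical and fixed by hand, whereas a KAN would learn them as splines:

```python
import numpy as np

# Toy check of the Kolmogorov-Arnold form (illustrative only):
# f(x, y) = exp(sin(pi*x) + y^2) is exactly one outer branch Phi
# applied to the sum of univariate inner functions psi_1(x) + psi_2(y).
def psi_1(x):
    return np.sin(np.pi * x)

def psi_2(y):
    return y ** 2

def Phi(s):
    return np.exp(s)

def f_kolmogorov(x, y):
    # One branch of the Kolmogorov-Arnold decomposition
    return Phi(psi_1(x) + psi_2(y))

def f_direct(x, y):
    return np.exp(np.sin(np.pi * x) + y ** 2)

x, y = np.random.default_rng(0).uniform(-1, 1, (2, 100))
assert np.allclose(f_kolmogorov(x, y), f_direct(x, y))
```

Most functions need several branches (the theorem guarantees $2d+1$ suffice), but the single-branch case already shows how a multivariate map reduces to compositions of univariate pieces.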

2. Architecture: Parameterization and Learning

A canonical KAN block consists of:

  • Layer-wise structure: Each layer $\ell$ consists of a matrix $[\phi_{\ell,i,j}]$ of univariate functions mapping input $x^{(\ell)}_i$ to output $x^{(\ell+1)}_j$ via

$$x^{(\ell+1)}_j = \sum_{i=1}^{n_\ell} \phi_{\ell,i,j}\left(x^{(\ell)}_i\right).$$

  • Univariate activations: Functions $\phi$ are parameterized as basis-function expansions—such as B-splines, Gaussian radial basis functions, or mixtures of symbolic primitives and dense splines—with all parameters learned end-to-end (Bagrow et al., 27 Nov 2025, Koenig et al., 2024).
  • Compositional depth: Stacking multiple such layers yields a "deep" KAN capable of representing highly complex nonlinear relationships. Extensions introduce explicit multiplication nodes ("MultKAN") for more natural modeling of products (Liu et al., 2024).

For convexity/monotonicity constraints (critical in constitutive modeling), spline activations can be restricted using linear inequalities on control points, ensuring output functions are physically admissible (e.g., polyconvex in hyperelasticity) (Thakolkaran et al., 7 Mar 2025).
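
A minimal sketch of one such layer is given below. As assumptions for brevity, a Gaussian radial-basis parameterization stands in for B-splines, the grid is fixed on $[-1, 1]$, and the class and argument names are illustrative. Each edge $(i, j)$ carries its own univariate function $\phi_{i,j}$, and outputs sum over inputs, matching the layer equation above:

```python
import numpy as np

# Minimal KAN layer sketch (assumed parameterization: Gaussian RBF basis
# instead of B-splines). Each edge (i, j) has its own univariate function
# phi_{i,j}(x) = sum_k coef[i, j, k] * exp(-((x - center_k) / width)^2),
# and outputs are x_j = sum_i phi_{i,j}(x_i).
class KANLayer:
    def __init__(self, n_in, n_out, n_basis=8, lo=-1.0, hi=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.centers = np.linspace(lo, hi, n_basis)   # basis-function grid
        self.width = (hi - lo) / (n_basis - 1)        # shared bandwidth
        # coef[i, j, k]: learnable weight of basis k on edge (i, j)
        self.coef = rng.normal(0.0, 0.1, (n_in, n_out, n_basis))

    def __call__(self, x):
        # x: (batch, n_in) -> basis: (batch, n_in, n_basis)
        basis = np.exp(-(((x[..., None] - self.centers) / self.width) ** 2))
        # phi_{i,j}(x_i) = sum_k coef[i, j, k] * basis_k(x_i); then sum over i
        return np.einsum("bik,ijk->bj", basis, self.coef)

layer = KANLayer(n_in=3, n_out=2)
out = layer(np.random.default_rng(1).uniform(-1, 1, (5, 3)))
```

In a real implementation the coefficients would be trained by gradient descent, and the grid would typically be refined or extended adaptively as training proceeds.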

3. Symbolic Regression Pipeline with KANs

Symbolic regression with KANs proceeds in three systematic stages:

  1. Training:
    • KANs are optimized using standard loss objectives (e.g., mean squared error for regression, augmented with L₁/L₂ or entropy-based regularization for sparsity and simplicity) (Dorji et al., 27 Aug 2025, Cappi et al., 25 Aug 2025).
    • Data normalization and basis/architecture hyperparameters (spline grid size, polynomial degree, number of layers/branches) are critical for stable fitting and extractable results (Jacobs et al., 27 Jan 2026).
  2. Extraction of Symbolic Formulas:
    • Once trained, each univariate activation $\phi$ is either directly sparse or is pruned for negligible terms (via magnitude/entropy thresholds).
    • Each learned $\phi$ is fit to closed-form symbolic candidates (polynomials, exponentials, logs, trigonometric functions, etc.) by nonlinear least squares or symbolic regression libraries (e.g., PySR, SymPy) (Koenig et al., 2024, Dorji et al., 27 Aug 2025, Bagrow et al., 27 Nov 2025).
    • Outer summations/compositions are reassembled, yielding a human-readable closed-form expression; further simplification (e.g., by algebraic software or post-hoc LLM-based simplification) is possible (Harvey et al., 12 May 2025).
    • In S2KAN models, symbolic primitives are included directly in the activation expansion with learnable, differentiable gates, and a minimum description length objective automatically selects the simplest explanatory symbolic structure (Bagrow et al., 27 Nov 2025).
  3. Validation and Model Selection:
    • Extracted formulas are validated against held-out data (MSE, $R^2$) and domain-consistency checks (parsimony, interpretability, alignment with physical knowledge).
    • In dynamic or structured domains, derived equations are rolled out via numerical integration to compare long-horizon predictions (e.g., ODE/PDE rollouts, graph dynamics) (Cappi et al., 25 Aug 2025, Koenig et al., 2024).
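
The candidate-fitting step in stage 2 can be sketched as a small loop: each learned activation, given as sampled $(x, y)$ pairs, is fit against a candidate library by nonlinear least squares and scored by MSE plus a complexity penalty. The library, complexity values, and penalty weight below are illustrative assumptions, not any single paper's pipeline:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical candidate library: (callable, complexity) per closed form
CANDIDATES = {
    "a*x + b":    (lambda x, a, b: a * x + b, 2),
    "a*x**2 + b": (lambda x, a, b: a * x ** 2 + b, 3),
    "a*sin(b*x)": (lambda x, a, b: a * np.sin(b * x), 3),
    "a*exp(b*x)": (lambda x, a, b: a * np.exp(b * x), 3),
}

def snap_to_symbolic(x, y, complexity_weight=1e-3):
    """Return the best-scoring (name, params) closed form for samples (x, y)."""
    best = (None, np.inf, None)
    for name, (func, complexity) in CANDIDATES.items():
        try:
            params, _ = curve_fit(func, x, y, p0=[1.0, 1.0], maxfev=5000)
        except RuntimeError:
            continue  # fit did not converge; skip this candidate
        mse = np.mean((func(x, *params) - y) ** 2)
        score = mse + complexity_weight * complexity  # accuracy + parsimony
        if score < best[1]:
            best = (name, score, params)
    return best[0], best[2]

# Example: an activation that is secretly 0.5 * exp(1.2 * x)
x = np.linspace(-2.0, 2.0, 200)
name, params = snap_to_symbolic(x, 0.5 * np.exp(1.2 * x))
```

The complexity penalty is what steers selection toward parsimonious forms when several candidates fit comparably well; production pipelines use richer libraries and information-theoretic scores.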

4. Applications in Scientific and Engineering Domains

Symbolic regression via KANs has been demonstrated in a range of contexts:

  • Hidden physics and dynamical systems: KANs embedded in neural ODEs (KAN-ODEs) have learned sparse analytic source terms from rich time-series or sparse PDE data, successfully distilling equations such as the Fisher–KPP reaction term and suggesting correct parametric dependence in the Lotka–Volterra system (Koenig et al., 2024).
  • Constitutive modeling: Input-convex KANs (ICKANs) have yielded analytic, physically admissible hyperelastic energy laws (e.g., Neo-Hookean, Arruda–Boyce, Ogden) from noisy full-field strain data, matching classical forms and experimental validation metrics (Thakolkaran et al., 7 Mar 2025).
  • Composite load modeling in energy systems: KANs enabled the discovery of transparent symbolic load models that capture nonlinear dependence on measured voltages and frequencies, surpassing MLP and Random Forest baselines in both accuracy and interpretability (Dorji et al., 27 Aug 2025).
  • Graph dynamical systems: KANs for graphs ("GKAN-ODE") have produced interpretable, parameter-efficient laws for epidemic, biochemical, and oscillator models, clearly mapping learned splines to known dynamic forms (Cappi et al., 25 Aug 2025).
  • Materials property prediction and engineering: KAN-derived closed-form models have matched hand-crafted and MLP-based formulas for embrittlement or pump performance with comparable error but vastly improved interpretability and parameter efficiency (Jacobs et al., 27 Jan 2026, Peng et al., 2024).

5. Advances in Symbolic Extraction and Model Simplification

Recent developments in KAN-based symbolic regression address the fidelity and parsimony of extracted expressions:

  • S2KAN ("Softly Symbolified" KAN): Incorporates a symbolic dictionary into training, using hard-concrete gates and Minimum Description Length regularization to induce sparsity and select interpretable forms; S2KAN matches or exceeds standard KAN accuracy on benchmarks with notably fewer parameters and high symbolic fidelity (Bagrow et al., 27 Nov 2025).
  • Divide-and-conquer and structure exploitation: KAN-SR applies modular simplification (separability, symmetry, brute-force matching) prior to full KAN training, achieving superior exact symbolic recovery rates on scientific benchmarks such as the Feynman SRSD (Bühler et al., 12 Sep 2025).
  • Multimodal and LLM-guided extraction: Vision-capable LLMs (e.g., GPT-4o) consume 1D spline plots from KAN edges to generate symbolic ansätze, facilitating automated, low-complexity formula recovery—even for multivariate problems—by assembling and simplifying the candidate formulas (Harvey et al., 12 May 2025).
  • Bidirectional science–KAN synergy ("KAN 2.0"): Enables injection of prior structural knowledge via "kanpiler," extraction of symbolic modular trees, and explicit modeling of multiplicative mechanisms for improved alignment with theoretical physics (Liu et al., 2024).

6. Regularization, Hyperparameters, and Practical Considerations

KAN-based symbolic regression relies on carefully tuned regularization and architecture choices:

  • Sparsity and entropy regularization promote parsimonious models and clear symbolic extraction, either via explicit penalties or structured dictionary learning (Cappi et al., 25 Aug 2025, Bagrow et al., 27 Nov 2025).
  • Grid size and basis selection (e.g., number of spline knots, basis function types) control the trade-off between expressivity and interpretability; over-parameterization can hinder clarity in symbolic extraction (Jacobs et al., 27 Jan 2026).
  • Training stability is improved via normalization, batch regularization, careful learning-rate scheduling, and cross-validation with outlier rejection to handle KANs' potential for numerical brittleness.
  • Pruning and symbolic matching thresholds are critical; overly aggressive pruning may miss essential terms, while lax thresholds can introduce redundancy or instability.
  • Comparison to gradient-based and genetic-programming symbolic regression: KANs often require fewer parameters, can yield equivalent or lower error, and produce more compact analytic forms than MLPs, tree-based models, or traditional symbolic regression tools (AI Feynman, Eureqa, PySR), particularly when structure in the data is well-aligned with the Kolmogorov-Arnold decomposition (Bühler et al., 12 Sep 2025, Panczyk et al., 4 Apr 2025).
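
As an illustration of the first point, a sparsity penalty on per-edge activation magnitudes can be written as follows. This loosely follows the L1-plus-entropy regularizers described above; exact normalizations and weights vary between implementations, so the form here is a sketch:

```python
import numpy as np

# Sketch of an L1 + entropy sparsity penalty for one KAN layer.
# The L1 term shrinks edge magnitudes overall, while the entropy term
# concentrates importance on few edges, easing pruning and extraction.
def sparsity_penalty(edge_acts, lam_l1=1.0, lam_ent=1.0, eps=1e-12):
    # edge_acts: (batch, n_in, n_out) values phi_{i,j}(x_i) for one layer
    mag = np.mean(np.abs(edge_acts), axis=0)  # per-edge magnitude (n_in, n_out)
    l1 = mag.sum()
    p = mag / (l1 + eps)                      # normalized edge importances
    entropy = -(p * np.log(p + eps)).sum()
    return lam_l1 * l1 + lam_ent * entropy
```

With equal total magnitude, a layer whose activation mass is concentrated on a single edge incurs a lower penalty than one with mass spread uniformly, which is exactly the bias toward sparse, extractable structure described above.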

7. Limitations and Prospects

KAN-based symbolic regression exhibits both strong capabilities and several limitations:

  • Scalability: The number of learnable univariate functions scales quadratically with input dimension and layer width, posing computational and extraction challenges as models grow (Liu et al., 2024).
  • Symbolic fidelity: While S2KAN and divide-and-conquer approaches enhance symbolic accuracy, pathological or dense solutions may still occur, especially outside the training domain or when the symbolic dictionary is incomplete (Bagrow et al., 27 Nov 2025).
  • Extrapolation: Like other data-driven symbolic learners, KANs may diverge on out-of-distribution data; incorporating physical constraints or domain-appropriate basis expansions can mitigate, but does not eliminate, this risk (Chen et al., 2024).
  • Automation and interpretability: Although symbolic extraction pipelines are increasingly automated, post-hoc simplification and validation often require manual or domain-informed intervention.
  • Future directions: Extensions under development include rational/wavelet bases, automated hyperparameter search, foundation-model pretraining for few-shot symbolic regression, domain-specific simplification strategies, and direct integration with scientific-knowledge repositories and trusted solvers (Liu et al., 2024).

KANs, by combining the universal-approximation power of deep neural networks with a transparent, additive, and compositional architecture, have transformed the landscape of symbolic regression. They provide an effective and principled route from raw data and time series to parsimonious, interpretable, and physically meaningful formulas across a wide spectrum of scientific modeling domains (Koenig et al., 2024, Thakolkaran et al., 7 Mar 2025, Bühler et al., 12 Sep 2025, Bagrow et al., 27 Nov 2025).
