A-PIKAN: Kolmogorov–Arnold Neural Architecture

Updated 23 December 2025

A-PIKAN is a neural architecture that uses the Kolmogorov–Arnold superposition theorem to decompose multivariate functions into parallel univariate mappings while embedding physical law constraints.
It employs a TruncatedPolynomialLayer to expand the input features and eight parallel univariate MLP branches to process them, while a physics-informed training loss enhances stability and interpretability.
Quantitative results show that PIKAN achieves MAE ≈ 10.50 and R² ≈ 0.96, matching conventional PINNs while providing smoother physics-residual loss curves and branch-level insights.

The Physics-Informed Kolmogorov–Arnold Network (PIKAN) is a neural architecture that integrates universal function approximation theory with physical law constraints, specifically designed for small-data settings in structural engineering. Developed within the context of image-based prediction of spaghetti bridge load-bearing capacity, PIKAN extends the standard physics-informed neural network (PINN) framework by replacing the classical multilayer perceptron (MLP) with a Kolmogorov–Arnold Network (KAN)-style architecture. This approach leverages the Kolmogorov–Arnold superposition theorem to achieve universal approximation via parallel, univariate neural branches, while explicitly encoding governing equations from structural mechanics in its training objective. The result is a model that matches the accuracy of conventional PINNs while providing greater training stability and interpretability in physically constrained settings (Khan et al., 27 Oct 2025).

1. Architectural Overview and Theoretical Foundation

PIKAN’s architecture consists of several stages that jointly enforce data fidelity and physical constraints. The input, a standardized vector of geometric and material parameters, is first expanded by a TruncatedPolynomialLayer that augments the features with all terms up to degree three (squares, cubes, and cross-terms). The expanded feature vector is then processed by $Q=8$ parallel univariate MLP branches, each a small network (MLP $[32 \rightarrow 16]$) with tanh activations, batch normalization, and dropout. The branch outputs are concatenated and passed to a final aggregation MLP $[64 \rightarrow 32]$ with ReLU activations and dropout.
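The degree-three expansion can be illustrated with a short, dependency-free sketch. The function name `truncated_polynomial_features` and the input values are hypothetical; the source names the layer but does not give its implementation, so this only shows one plausible way to enumerate squares, cubes, and cross-terms.

```python
from itertools import combinations_with_replacement
from math import prod

def truncated_polynomial_features(x, degree=3):
    """Return all monomials of the inputs with total degree 1..degree.

    Hypothetical stand-in for the TruncatedPolynomialLayer; the constant
    (degree-0) term is left out.
    """
    features = []
    for d in range(1, degree + 1):
        # An index tuple such as (0, 0, 2) stands for x[0] * x[0] * x[2].
        for idx in combinations_with_replacement(range(len(x)), d):
            features.append(prod(x[i] for i in idx))
    return features

x = [2.0, 3.0, 5.0]  # illustrative inputs, not data from the source
phi = truncated_polynomial_features(x)
print(len(phi))  # 3 linear + 6 quadratic + 10 cubic = 19 features
```

For $n$ inputs the expansion yields $\binom{n+3}{3} - 1$ features, so the feature count grows quickly with $n$, which is one reason the degree is truncated.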

PIKAN explicitly implements a decomposition of the target function as a sum of univariate mappings, in alignment with the Kolmogorov–Arnold representation, which states that any continuous function $f:[0,1]^n \to \mathbb{R}$ can be written as a superposition of univariate functions:

$f(x_1, \ldots, x_n) = \sum_{j=1}^{2n+1} \Phi_j\left( \sum_{i=1}^n \psi_{ij}(x_i) \right)$

PIKAN approximates this with $Q=8$ branches for a typical input dimensionality of $n \approx 3$–$5$.
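The shape of the superposition is easiest to see on a function with an exact closed form. The sketch below writes the product $x_1 x_2$ as $(x_1 + x_2)^2/4 - (x_1 - x_2)^2/4$, i.e. two outer univariate functions applied to sums of inner univariate functions. This is only an instance of the same outer-of-sum-of-inner form, not the construction used in the theorem, and the function name is hypothetical.

```python
def product_via_superposition(x1, x2):
    """Compute x1 * x2 using only univariate functions and addition."""
    # Inner univariate functions psi_ij: one pair (for x1, x2) per term j.
    psi = [
        (lambda x: x, lambda x: x),    # term 1 builds x1 + x2
        (lambda x: x, lambda x: -x),   # term 2 builds x1 - x2
    ]
    # Outer univariate functions Phi_j applied to each inner sum.
    Phi = [lambda u: u * u / 4.0, lambda u: -u * u / 4.0]
    return sum(Phi_j(p1(x1) + p2(x2)) for Phi_j, (p1, p2) in zip(Phi, psi))

value = product_via_superposition(0.3, 0.7)
```

In PIKAN the hand-picked functions above are replaced by small learned networks, one per branch.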

Compared to a standard PINN, which uses a single feedforward MLP (e.g., $64$–$128$–$64$ units, ReLU), PIKAN structurally enforces multivariate decomposition and achieves more stable training for physics-residual terms.
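A forward pass through the branch-and-aggregate layout might look roughly as follows. This is a minimal inference-time sketch under several assumptions: $[32 \rightarrow 16]$ and $[64 \rightarrow 32]$ are read as successive hidden widths, a linear scalar output layer is appended, batch normalization and dropout are omitted, and weights are random. All function names are hypothetical, and a real implementation would normally use a deep learning framework rather than plain lists.

```python
import math
import random

def dense(n_in, n_out, rng):
    """Randomly initialised weights and zero biases for one linear layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def apply_layer(layer, x, act):
    weights, bias = layer
    return [act(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(weights, bias)]

def relu(z):
    return max(0.0, z)

def build_pikan(n_features, rng, q=8):
    # Q parallel branches, each with hidden widths 32 and 16 (assumed reading).
    branches = [[dense(n_features, 32, rng), dense(32, 16, rng)] for _ in range(q)]
    # Aggregation MLP with widths 64 and 32, plus an assumed scalar output layer.
    head = [dense(16 * q, 64, rng), dense(64, 32, rng), dense(32, 1, rng)]
    return branches, head

def forward(model, features):
    branches, head = model
    concat = []
    for branch in branches:
        h = features
        for layer in branch:
            h = apply_layer(layer, h, math.tanh)   # tanh inside each branch
        concat.extend(h)                            # concatenate branch outputs
    h = concat
    for layer in head[:-1]:
        h = apply_layer(layer, h, relu)             # ReLU in the aggregation MLP
    return apply_layer(head[-1], h, lambda z: z)[0]

rng = random.Random(0)
model = build_pikan(n_features=19, rng=rng)
y = forward(model, [0.1] * 19)   # 19 = expanded feature count for 3 raw inputs
```

Keeping the branches separate until the concatenation step is what allows each branch to be inspected on its own.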

2. Physics-Informed Learning Objective

PIKAN minimizes a total loss that combines a data-misfit component and a physics-residual component:

$L_{\text{total}} = \lambda_{\text{data}} L_{\text{data}} + \lambda_{\text{phys}} L_{\text{phys}}$

where data loss is the mean squared error (MSE) between predicted and observed quantities, and the physics loss $L_{\text{phys}}$ consists of summed MSEs over residuals of multiple structural mechanics constraints. Example physics constraints include:

$\sigma = F/A$ (stress)
$\delta = FL/(AE)$ (axial deformation of a loaded member)
Constitutive and yield relations (Hooke’s law, von Mises criterion)
Shear and buckling limits

Implemented as:

$L_{\text{phys}} = \text{MSE}(\sigma_{\text{pred}} - F/A, 0) + \text{MSE}(\delta_{\text{pred}} - FL/(AE), 0) + \cdots$

with typical weighting coefficients $\lambda_{\text{data}} = 0.7$, $\lambda_{\text{phys}} = 0.3$.

No explicit $L^2$ weight penalty is employed; the only additional regularization is the architectural dropout.
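The weighted objective can be written out in a few lines. The sketch below includes only the two residuals given explicitly in the text ($\sigma = F/A$ and $\delta = FL/(AE)$); the remaining constraints would be added to `phys` in the same way. The function names, argument layout, and numerical values are illustrative assumptions, not the authors' code.

```python
def mse(residuals):
    """Mean of squared residuals, i.e. MSE(residual, 0)."""
    return sum(r * r for r in residuals) / len(residuals)

def total_loss(y_pred, y_true, sigma_pred, delta_pred, F, A, L, E,
               lam_data=0.7, lam_phys=0.3):
    # Data term: misfit between predicted and observed targets.
    data = mse([p - t for p, t in zip(y_pred, y_true)])
    # Physics term: residuals of sigma = F/A and delta = F*L/(A*E).
    stress_res = [s - f / a for s, f, a in zip(sigma_pred, F, A)]
    deform_res = [d - f * l / (a * e)
                  for d, f, l, a, e in zip(delta_pred, F, L, A, E)]
    phys = mse(stress_res) + mse(deform_res)
    return lam_data * data + lam_phys * phys

# Illustrative numbers only.
loss = total_loss(
    y_pred=[10.0, 12.0], y_true=[11.0, 12.0],
    sigma_pred=[2.5, 3.0], delta_pred=[0.1, 0.2],
    F=[4.0, 9.0], A=[2.0, 3.0], L=[1.0, 1.0], E=[20.0, 15.0],
)
# data = 0.5, physics = 0.125, so loss = 0.7 * 0.5 + 0.3 * 0.125 = 0.3875
```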

3. Input Parameterization and Feature Extraction

The model accepts input vectors $x$ encoding both geometry and material attributes:

Beam lengths, diameters, angles, beam count
Material density $\rho$, Young’s modulus $E$, yield strength $\sigma_y$

Parameter acquisition is dual-mode:

Manual entry through a web interface
Automatic extraction via a computer vision pipeline using grayscale conversion, Gaussian blur, Laplacian of Gaussian (LoG) edge detection, binary masking, FAST corner detection, spatial clustering, and conversion of Euclidean pixel distances to metric units

Extracted features are z-score standardized before polynomial feature expansion.
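The last two steps, metric conversion and standardization, are simple enough to sketch without an image library. The calibration factor `mm_per_pixel`, the corner coordinates, and the use of the population standard deviation are all assumptions made for illustration; the source does not specify them.

```python
import math
import statistics

def pixels_to_mm(corner_a, corner_b, mm_per_pixel):
    """Euclidean pixel distance between two detected corners, scaled to mm."""
    return math.dist(corner_a, corner_b) * mm_per_pixel

def zscore(column):
    """Standardize one feature column to zero mean and unit variance."""
    mu = statistics.fmean(column)
    sd = statistics.pstdev(column)   # population std; sample std is also common
    return [(v - mu) / sd for v in column]

# Hypothetical calibration: a 100 mm reference bar spans 200 pixels.
mm_per_pixel = 100.0 / 200.0
corner_pairs = [((0, 0), (300, 400)), ((0, 0), (360, 480)), ((0, 0), (420, 560))]
lengths_mm = [pixels_to_mm(a, b, mm_per_pixel) for a, b in corner_pairs]
z_lengths = zscore(lengths_mm)   # lengths of 250, 300, 350 mm before scaling
```

The same mean and standard deviation computed on the training data would need to be reused at prediction time so that new inputs land on the same scale.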

4. Training Procedure and Data Augmentation

PIKAN is trained with the Adam optimizer (learning rate $1 \times 10^{-3}$ ), batch size $8$, for $80$ epochs on a dataset of $100$ augmented samples generated from $15$ manually characterized bridge models. The training split reserves $20\%$ for validation. Data augmentation involves:

$\pm 10\%$ random variation in geometric parameters
Additive Gaussian noise
Physics-consistent rescaling, e.g., adjusting weight proportional to length and cross-sectional area
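The three augmentation steps above can be sketched as follows. The record fields, the noise level `noise_sd`, the choice to add the Gaussian noise to the target load, and the assumption of a circular cross-section are all hypothetical; the source lists the steps but not their exact parameters.

```python
import random

def augment(sample, rng, noise_sd=0.02):
    """Return one perturbed copy of a bridge record (dict of floats)."""
    k_len = rng.uniform(0.9, 1.1)   # +/-10% variation in length
    k_dia = rng.uniform(0.9, 1.1)   # +/-10% variation in diameter
    area_ratio = k_dia ** 2         # circular cross-section: area scales with d^2
    return {
        "length": sample["length"] * k_len,
        "diameter": sample["diameter"] * k_dia,
        # Physics-consistent rescaling: weight ~ length * cross-sectional area.
        "weight": sample["weight"] * k_len * area_ratio,
        # Additive Gaussian noise, here applied to the target load (assumed).
        "load": sample["load"] + rng.gauss(0.0, noise_sd * sample["load"]),
    }

rng = random.Random(42)
# 15 illustrative base records standing in for the characterized bridges.
base = [{"length": 100.0 + i, "diameter": 2.0, "weight": 50.0, "load": 200.0}
        for i in range(15)]
augmented = [augment(base[i % len(base)], rng) for i in range(100)]
```

Rescaling the weight together with the geometry keeps each synthetic record physically self-consistent instead of perturbing every field independently.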

Feature standardization is applied prior to training. No early stopping is implemented; convergence typically occurs by epoch $80$.
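The stated hyperparameters imply a small training budget, which the arithmetic below makes explicit. It assumes the 20% validation split is taken from the 100 augmented samples and that a final partial batch, if any, still counts as one update.

```python
import math

n_samples, val_fraction = 100, 0.20
batch_size, epochs = 8, 80

n_val = round(n_samples * val_fraction)
n_train = n_samples - n_val
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs
print(n_train, n_val, steps_per_epoch, total_updates)  # 80 20 10 800
```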

5. Quantitative Performance and Comparative Results

On the held-out validation set, PIKAN and standard PINN exhibit nearly identical performance:

Mean Absolute Error (MAE): $10.50$ (both models)
$R^2$ : $0.9600$ (PIKAN), $0.9603$ (PINN)

Baseline models perform significantly worse:

Model	MAE	$R^2$
Standard PINN	10.50	0.9603
PIKAN	10.50	0.9600
Baseline MLP (ReLU)	15.87	0.9211
Linear Regression	25.43	0.7653

Both PINN and PIKAN explain approximately $96\%$ of the variance and reduce MAE by about one third relative to the baseline MLP (from $15.87$ to $10.50$). PIKAN further distinguishes itself through more stable physics-residual loss curves, free of extreme spikes, and through branch-level interpretability.
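For reference, the two reported metrics and the relative MAE reduction can be computed as below. The helper names and the toy vectors are illustrative; only the MAE values $15.87$ and $10.50$ come from the table.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Relative MAE reduction of PINN/PIKAN over the baseline MLP (table values).
reduction = (15.87 - 10.50) / 15.87
print(round(reduction, 3))  # 0.338

# Toy check of the metric helpers on made-up numbers.
toy_mae = mae([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
toy_r2 = r2([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```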

6. Interpretability, Stability, and Adaptability

Each parallel univariate branch in PIKAN can be analyzed as a distinct mapping, affording diagnostics into which input components drive predictions. The Kolmogorov–Arnold-inspired structure yields smoother training dynamics, though at a computational cost approximately $1.8\times$ that of a basic single-MLP PINN configuration. The explicit embedding of eight physics constraints greatly mitigates overfitting even in small-data scenarios. The architecture’s modularity permits re-use for diverse structural prediction tasks by substituting appropriate physics residuals in $L_{\text{phys}}$ while retaining the polynomial expansion and univariate-branch design.

7. Practical Considerations and Extensions

Several practical points bear on deploying A-PIKAN:

Adjusting $\lambda_{\text{phys}} \in [0.3, 0.5]$ or modifying polynomial degree enables tailored trade-offs between empirical fit and physics enforcement
Calibration of metric conversion steps in the computer vision pipeline is necessary for robust deployment under varying imaging conditions
Uncertainty quantification may be addressed in future variants through Monte Carlo dropout or deep ensembles, providing prediction intervals relevant to early-stage structural design
Interpretability at the branch level suggests PIKAN may serve as a diagnostic tool in lightweight bridge or truss design scenarios

PIKAN’s architecture thus combines universal function approximation with physical-law constraints, enables reliable estimation of structural performance on limited datasets, and offers a template for broader application across small-data, physics-constrained engineering tasks (Khan et al., 27 Oct 2025).


References

Khan et al. (27 Oct 2025). Seeing Structural Failure Before it Happens: An Image-Based Physics-Informed Neural Network (PINN) for Spaghetti Bridge Load Prediction.
