A-PIKAN: Kolmogorov–Arnold Neural Architecture
- A-PIKAN is a neural architecture that uses the Kolmogorov–Arnold superposition theorem to decompose multivariate functions into parallel univariate mappings while embedding physical law constraints.
- It employs a TruncatedPolynomialLayer and eight parallel univariate MLP branches to expand input features and enforce physics-based training, enhancing stability and interpretability.
- Quantitative results show that PIKAN achieves MAE ≈ 10.50 and R² ≈ 0.96, matching conventional PINNs while providing smoother physics-residual loss curves and branch-level insights.
The Physics-Informed Kolmogorov–Arnold Network (PIKAN) is a neural architecture that integrates universal function approximation theory with physical law constraints, specifically designed for small-data settings in structural engineering. Developed within the context of image-based prediction of spaghetti bridge load-bearing capacity, PIKAN extends the standard physics-informed neural network (PINN) framework by replacing the classical multilayer perceptron (MLP) with a Kolmogorov–Arnold Network (KAN)-style architecture. This approach leverages the Kolmogorov–Arnold superposition theorem to achieve universal approximation via parallel, univariate neural branches, while explicitly encoding governing equations from structural mechanics in its training objective. The result is a model that matches the accuracy of conventional PINNs while providing greater training stability and interpretability in physically constrained settings (Khan et al., 27 Oct 2025).
1. Architectural Overview and Theoretical Foundation
PIKAN’s architecture consists of several stages that jointly enforce data fidelity and the satisfaction of physical constraints. The input, a standardized vector of geometric and material parameters, is first expanded by a TruncatedPolynomialLayer that augments the features with terms up to degree three (squared, cubed, and cross-terms). The expanded feature vector is then processed by eight parallel univariate MLP branches, each mapping the feature vector through a small MLP with tanh activations, batch normalization, and dropout. The branch outputs are concatenated and passed to a final aggregation MLP with ReLU activations and dropout.
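The polynomial expansion step can be illustrated with a minimal sketch. The source names a TruncatedPolynomialLayer but does not give its implementation, so the function name `truncated_polynomial_features`, the feature ordering, and the omission of a bias term are assumptions made here for illustration.

```python
from itertools import combinations_with_replacement
from math import prod


def truncated_polynomial_features(x, degree=3):
    """Return all monomials of the inputs with total degree 1..degree.

    Hypothetical stand-in for the TruncatedPolynomialLayer: it yields the
    original features, their squares and cubes, and all cross-terms.
    """
    features = []
    for d in range(1, degree + 1):
        # Each index tuple (i, j, ...) selects one monomial x_i * x_j * ...
        for idx in combinations_with_replacement(range(len(x)), d):
            features.append(prod(x[i] for i in idx))
    return features


expanded = truncated_polynomial_features([2.0, 3.0])
print(len(expanded), expanded)
```

For two inputs this produces nine features (two linear, three quadratic, four cubic), so the expanded width grows quickly with input dimension, which is one reason such expansions suit low-dimensional inputs.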
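PIKAN explicitly implements a decomposition of the target function as a sum of univariate mappings, in alignment with the Kolmogorov–Arnold representation theorem, which states that any continuous function of $n$ variables on a bounded domain can be written as a superposition of continuous univariate functions:
$$f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left(\sum_{p=1}^{n} \phi_{q,p}(x_p)\right)$$
PIKAN approximates this with eight parallel branches, for typical input dimensionalities of up to $5$.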
Compared to a standard PINN, which uses a single feedforward MLP (e.g., $64$–$128$–$64$ units, ReLU), PIKAN structurally enforces multivariate decomposition and achieves more stable training for physics-residual terms.
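The branched forward pass can be sketched as follows. This is a minimal illustration, assuming untrained random weights and omitting batch normalization and dropout; the class name `BranchedNet`, the hidden width of 4, and the initialization range are hypothetical choices, not details from the source.

```python
import math
import random


def init_dense(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, act):
    """Apply one layer: act(W @ x + b), written with plain lists."""
    weights, biases = layer
    return [act(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def relu(z):
    return max(0.0, z)


def identity(z):
    return z


class BranchedNet:
    """Parallel tanh branches with scalar outputs, then a ReLU aggregator."""

    def __init__(self, n_features, n_branches=8, hidden=4, seed=0):
        rng = random.Random(seed)
        self.branches = [
            (init_dense(n_features, hidden, rng), init_dense(hidden, 1, rng))
            for _ in range(n_branches)
        ]
        self.agg_hidden = init_dense(n_branches, hidden, rng)
        self.agg_out = init_dense(hidden, 1, rng)

    def branch_outputs(self, x):
        # One scalar per branch; these are the values inspected for diagnostics.
        return [dense(dense(x, h, math.tanh), o, identity)[0]
                for h, o in self.branches]

    def forward(self, x):
        z = self.branch_outputs(x)  # concatenated branch outputs
        return dense(dense(z, self.agg_hidden, relu), self.agg_out, identity)[0]


net = BranchedNet(n_features=9)
x = [0.1] * 9
print(net.branch_outputs(x), net.forward(x))
```

Keeping `branch_outputs` separate from `forward` is a design choice that makes each branch's contribution directly observable, in contrast to a single MLP where intermediate activations have no such per-branch meaning.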
2. Physics-Informed Learning Objective
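PIKAN minimizes a total loss that combines a data-misfit component and a physics-residual component:
$$\mathcal{L}_{\text{total}} = \lambda_{\text{data}}\,\mathcal{L}_{\text{data}} + \lambda_{\text{phys}}\,\mathcal{L}_{\text{phys}}$$
where the data loss $\mathcal{L}_{\text{data}}$ is the mean squared error (MSE) between predicted and observed quantities, and the physics loss $\mathcal{L}_{\text{phys}}$ is the sum of MSEs over the residuals of multiple structural mechanics constraints. Example physics constraints include: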
- Stress relation
- Beam deflection relation
- Constitutive relations (Hooke’s Law, Von Mises criterion)
- Shear and buckling limits
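Implemented as:
$$\mathcal{L}_{\text{total}} = \lambda_{\text{data}}\,\mathrm{MSE}(\hat{y}, y) + \lambda_{\text{phys}} \sum_{k=1}^{K} \mathrm{MSE}(r_k, 0)$$
where $\hat{y}$ and $y$ are the predicted and observed targets, $r_k$ is the residual of the $k$-th physics constraint, and $\lambda_{\text{data}}$, $\lambda_{\text{phys}}$ are the weighting coefficients.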
No additional penalties are employed beyond architectural dropout.
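A minimal sketch of this objective, assuming each physics constraint has already been evaluated to a list of per-sample residuals, is given below. The function names, the toy numbers, and the weights `lambda_data` and `lambda_phys` shown here are placeholders, not values from the source.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def total_loss(pred, target, residuals, lambda_data=1.0, lambda_phys=1.0):
    """Weighted sum of the data MSE and the summed physics-residual MSEs.

    residuals: one list of per-sample residual values per physics constraint;
    a perfectly satisfied constraint has all residuals equal to zero.
    """
    data_loss = mse(pred, target)
    physics_loss = sum(mse(r, [0.0] * len(r)) for r in residuals)
    return lambda_data * data_loss + lambda_phys * physics_loss


# Toy values: two samples, two hypothetical constraints.
loss = total_loss(
    pred=[1.0, 2.0],
    target=[1.0, 4.0],
    residuals=[[0.5, -0.5], [1.0, 1.0]],
    lambda_phys=0.5,
)
print(loss)
```

Here the data term is 2.0 and the physics term is 0.25 + 1.0 = 1.25, so the weighted total is 2.0 + 0.5 × 1.25 = 2.625.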
3. Input Parameterization and Feature Extraction
The model accepts input vectors encoding both geometry and material attributes:
- Beam lengths, diameters, angles, beam count
- Material density $\rho$, Young’s modulus $E$, yield strength $\sigma_y$
Parameter acquisition is dual-mode:
- Manual entry through a web interface
- Automatic extraction via a computer vision pipeline: grayscale conversion, Gaussian blur, Laplacian of Gaussian (LoG) edge detection, binary masking, FAST corner detection, spatial clustering, and conversion of Euclidean pixel distances to metric scale
Extracted features are z-score standardized before polynomial feature expansion.
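The standardization step can be sketched as below. The use of the population standard deviation and the guard for constant columns are assumptions; the source states only that features are z-score standardized.

```python
from statistics import fmean, pstdev


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance.

    Returns the scaled rows together with the per-column means and standard
    deviations, which must be reused to transform any new input.
    """
    columns = list(zip(*rows))
    means = [fmean(c) for c in columns]
    # A constant column has zero spread; divide by 1.0 instead of 0.0.
    stds = [pstdev(c) or 1.0 for c in columns]
    scaled = [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
    return scaled, means, stds


# Toy data: three samples, two features (the second one constant).
scaled, means, stds = zscore_columns([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
print(scaled)
```

Returning the fitted means and standard deviations matters in practice: parameters extracted from a new image have to be scaled with the training statistics, not with statistics of their own.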
4. Training Procedure and Data Augmentation
PIKAN is trained with the Adam optimizer at batch size $8$ for $80$ epochs on a dataset of $100$ augmented samples generated from $15$ manually characterized bridge models. A portion of the data is held out for validation. Data augmentation involves:
- Random variation in geometric parameters
- Additive Gaussian noise
- Physics-consistent rescaling, e.g., adjusting weight proportional to length and cross-sectional area
Feature standardization is applied prior to training. No early stopping is implemented; convergence typically occurs by epoch $80$.
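The three augmentation steps can be sketched together as follows. The field names, the 5% jitter range, the choice to add noise to the target load, and the circular cross-section (area proportional to diameter squared) are all assumptions made for illustration; the source does not specify them.

```python
import random


def augment(sample, rng, geom_jitter=0.05, noise_std=0.0):
    """Create one augmented copy of a bridge sample.

    sample: dict with 'length', 'diameter', 'weight', and 'load' (hypothetical
    field names). Geometry is jittered, weight is rescaled consistently, and
    Gaussian noise is added to the target.
    """
    s_len = 1.0 + rng.uniform(-geom_jitter, geom_jitter)
    s_dia = 1.0 + rng.uniform(-geom_jitter, geom_jitter)
    out = dict(sample)
    out["length"] = sample["length"] * s_len
    out["diameter"] = sample["diameter"] * s_dia
    # Physics-consistent rescaling: weight scales with length times
    # cross-sectional area, and area scales with diameter squared.
    out["weight"] = sample["weight"] * s_len * s_dia ** 2
    out["load"] = sample["load"] + rng.gauss(0.0, noise_std)
    return out


base = {"length": 100.0, "diameter": 2.0, "weight": 50.0, "load": 30.0}
aug = augment(base, random.Random(0))
print(aug)
```

Rescaling the weight from the same factors that perturb the geometry keeps each synthetic sample internally consistent, so the physics residuals are not penalizing artifacts of the augmentation itself.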
5. Quantitative Performance and Comparative Results
On the held-out validation set, PIKAN and standard PINN exhibit nearly identical performance:
- Mean Absolute Error (MAE): $10.50$
- $R^2$: $0.9600$ (PIKAN), $0.9603$ (PINN)
Baseline models perform significantly worse:
| Model | MAE | $R^2$ |
|---|---|---|
| Standard PINN | 10.50 | 0.9603 |
| PIKAN | 10.50 | 0.9600 |
| Baseline MLP (ReLU) | 15.87 | 0.9211 |
| Linear Regression | 25.43 | 0.7653 |
Both PINN and PIKAN account for approximately $96\%$ of the variance and reduce MAE by about one third relative to the baseline MLP ($10.50$ vs. $15.87$). PIKAN further distinguishes itself by delivering more stable physics-residual loss curves, avoiding extreme value spikes, and enabling branch-level interpretability.
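For reference, the two reported metrics and the "about one third" figure can be reproduced with a short sketch. The metric functions use the standard definitions of MAE and $R^2$; the toy prediction vectors are illustrative only, while the MAE values in the reduction come from the table above.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Relative MAE reduction of the physics-informed models versus the baseline
# MLP, using the MAE values reported in the table.
reduction = (15.87 - 10.50) / 15.87
print(f"{reduction:.1%}")

# Toy check of the metric functions on made-up values.
print(mae([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]), r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

The reduction evaluates to roughly 34%, consistent with the one-third figure quoted in the text.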
6. Interpretability, Stability, and Adaptability
Each parallel univariate branch in PIKAN can be analyzed as a distinct mapping, affording diagnostics into which input components drive predictions. The Kolmogorov–Arnold-inspired structure yields smoother training dynamics, though at a higher computational cost than a basic single-MLP PINN configuration. The explicit embedding of eight physics constraints mitigates overfitting even in small-data scenarios. The architecture’s modularity permits re-use for diverse structural prediction tasks by substituting appropriate physics residuals in the physics loss term while retaining the polynomial expansion and univariate-branch design.
7. Practical Considerations and Extensions
To optimize A-PIKAN’s practical implementation:
- Adjusting the loss weighting coefficients or modifying the polynomial degree enables tailored trade-offs between empirical fit and physics enforcement
- Calibration of metric conversion steps in the computer vision pipeline is necessary for robust deployment under varying imaging conditions
- Uncertainty quantification may be addressed in future variants through Monte Carlo dropout or deep ensembles, providing prediction intervals relevant to early-stage structural design
- Interpretability at the branch level suggests PIKAN may serve as a diagnostic tool in lightweight bridge or truss design scenarios
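The Monte Carlo dropout idea mentioned above can be sketched in a simplified form. This is not part of the published model: as a stand-in, dropout is applied to the input features of an arbitrary prediction function rather than to hidden layers, and the names `mc_dropout_predict`, `n_passes`, and `p_drop` are hypothetical.

```python
import random
from statistics import fmean, stdev


def mc_dropout_predict(predict_fn, x, n_passes=100, p_drop=0.1, seed=0):
    """Estimate a prediction mean and spread from repeated stochastic passes.

    Each pass zeroes every feature with probability p_drop and rescales the
    survivors by 1 / (1 - p_drop) (inverted dropout), then calls predict_fn.
    """
    rng = random.Random(seed)
    keep_scale = 1.0 / (1.0 - p_drop)
    samples = []
    for _ in range(n_passes):
        mask = [0.0 if rng.random() < p_drop else keep_scale for _ in x]
        samples.append(predict_fn([xi * m for xi, m in zip(x, mask)]))
    return fmean(samples), stdev(samples)


# Toy predictor: the sum of the features (illustrative only).
mean, spread = mc_dropout_predict(lambda v: sum(v), [1.0, 2.0, 3.0])
print(mean, spread)
```

In a real network the dropout layers already present in the branches and the aggregation MLP would simply be left active at inference time, and the spread across passes would serve as a rough prediction interval.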
PIKAN’s architecture thus delivers universal function approximations grounded in physical law, enables reliable estimation of structural performance on limited datasets, and offers a template for broader application across small-data, physics-constrained engineering tasks (Khan et al., 27 Oct 2025).