BSRBF-KAN: Compound B-spline & RBF Activations
- BSRBF-KAN is an architectural variant that combines B-spline and Gaussian RBF activation functions to enhance function approximation and classification.
- Experimental results show stable convergence with 97.55% on MNIST and 89.33% on Fashion-MNIST, highlighting its effective learning dynamics.
- The design employs a weighted sum of spline and RBF components, ensuring local control, global smoothness, and improved optimization efficiency.
BSRBF-KAN is an architectural variant within the Kolmogorov-Arnold Networks (KAN) family, explicitly combining B-spline basis functions and Gaussian radial basis functions (RBFs) as learnable activation functions. Designed for data-driven function approximation and classification tasks, BSRBF-KAN extends the representational flexibility and convergence behaviour characteristic of KANs by leveraging the complementary properties of both basis function families (Ta, 17 Jun 2024).
1. Architectural Framework
BSRBF-KAN is situated within the canonical KAN paradigm, which is inspired by the Kolmogorov–Arnold representation theorem:

$$f(x_1, \ldots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left(\sum_{p=1}^{n} \phi_{q,p}(x_p)\right)$$
In a typical layered KAN, each layer is structured as a matrix of univariate activation functions mapping scalar inputs through successive transformations. BSRBF-KAN constructs these activations as a weighted sum:

$$\phi(x) = w_b\, b(x) + w_s\left(\phi_{\mathrm{BS}}(x) + \phi_{\mathrm{RBF}}(x)\right)$$

where $w_b$, $w_s$ are learnable weights; $b(x)$ is a standard base nonlinearity (usually SiLU); $\phi_{\mathrm{BS}}(x)$ is a linear combination of B-spline basis functions; and $\phi_{\mathrm{RBF}}(x)$ is a linear combination of Gaussian RBFs, each centered at $c_i$ with width $\sigma_i$:

$$\phi_{\mathrm{RBF}}(x) = \sum_i w_i \exp\!\left(-\frac{(x - c_i)^2}{2\sigma_i^2}\right)$$
This approach provides both local control (via splines) and global smoothness/generalization (via Gaussian RBFs), allowing the architecture to fit diverse data distributions.
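The weighted-sum construction can be sketched numerically. The snippet below is a minimal illustration, not the paper's implementation: the function names are hypothetical, a shared width is assumed for all RBF centers, and the spline term is stood in for by a degree-1 (piecewise-linear "hat") B-spline basis rather than the cubic Cox-de Boor bases a real BSRBF layer would use.

```python
import numpy as np

def silu(x):
    """SiLU base nonlinearity: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def gaussian_rbf(x, centers, sigma):
    """Gaussian RBF responses of each scalar in x to each center."""
    return np.exp(-((x[..., None] - centers) ** 2) / (2.0 * sigma ** 2))

def bsrbf_activation(x, w_b, w_s, spline_coefs, rbf_coefs, centers, sigma):
    """Compound activation: weighted sum of base, spline, and RBF terms.

    Illustrative sketch: the spline basis here is degree-1 (hat functions)
    over the same grid as the RBF centers, for brevity.
    """
    h = centers[1] - centers[0]
    spline_basis = np.clip(1.0 - np.abs(x[..., None] - centers) / h, 0.0, None)
    spline_term = spline_basis @ spline_coefs          # local control
    rbf_term = gaussian_rbf(x, centers, sigma) @ rbf_coefs  # global smoothness
    return w_b * silu(x) + w_s * (spline_term + rbf_term)

centers = np.linspace(-2.0, 2.0, 8)
rng = np.random.default_rng(0)
x = rng.normal(size=5)
y = bsrbf_activation(x, 1.0, 0.5, rng.normal(size=8), rng.normal(size=8),
                     centers, sigma=0.5)
print(y.shape)  # (5,)
```

Each scalar input thus receives a smooth, differentiable response shaped jointly by the local spline bases and the globally decaying Gaussians.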
2. Mathematical Properties
The use of B-spline functions guarantees piecewise polynomial approximation, continuity, and local adaptability, critical for smooth function interpolation and learning. RBFs—especially Gaussian—offer isotropic, rapidly decreasing responses that excel in capturing similarity and local structure. The weighted combination in BSRBF-KAN ensures that the activation retains differentiability and is amenable to gradient-based optimization.
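The local-adaptability and smoothness properties of the B-spline component can be verified directly with the Cox-de Boor recursion. The sketch below (illustrative, not the paper's code) evaluates all cubic B-spline bases on a uniform knot grid and shows that, on the interior of the grid, the local bases form a partition of unity.

```python
import numpy as np

def bspline_basis(x, knots, degree):
    """Cox-de Boor recursion: all B-spline basis functions evaluated at x.

    Returns shape (len(x), n_basis) with n_basis = len(knots) - degree - 1.
    """
    x = np.asarray(x, dtype=float)
    # Degree-0 bases: indicator functions of the knot spans.
    B = np.array([(x >= knots[i]) & (x < knots[i + 1])
                  for i in range(len(knots) - 1)], dtype=float).T
    for d in range(1, degree + 1):
        nb = len(knots) - d - 1
        new = np.zeros((len(x), nb))
        for i in range(nb):
            denom1 = knots[i + d] - knots[i]
            denom2 = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / denom1 * B[:, i] if denom1 > 0 else 0.0
            right = (knots[i + d + 1] - x) / denom2 * B[:, i + 1] if denom2 > 0 else 0.0
            new[:, i] = left + right
        B = new
    return B

knots = np.linspace(-2.0, 2.0, 12)
x = np.linspace(-0.9, 0.9, 50)   # interior of the grid
B = bspline_basis(x, knots, degree=3)
print(B.shape)        # (50, 8)
print(B.sum(axis=1))  # ~1.0 everywhere: partition of unity
```

Each basis function is nonzero over only `degree + 1` knot spans, which is exactly the local control the text describes: adjusting one spline coefficient perturbs the activation only on a small interval.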
The overall network is a composition of such activation matrices:

$$\mathrm{BSRBF\text{-}KAN}(\mathbf{x}) = \left(\Phi_{L-1} \circ \Phi_{L-2} \circ \cdots \circ \Phi_0\right)(\mathbf{x})$$

where each $\Phi_l$ comprises element-wise functions $\phi(x)$ as described above.
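The layer-composition structure can be sketched as follows. This is a toy illustration under simplifying assumptions: each univariate edge function is an RBF-only linear combination over a shared basis (omitting the base and spline terms), and all names are hypothetical.

```python
import numpy as np

def rbf_features(x, centers, sigma=0.5):
    """Gaussian RBF features of each scalar input: (batch, n_in, n_basis)."""
    return np.exp(-((x[..., None] - centers) ** 2) / (2.0 * sigma ** 2))

def kan_layer(x, coefs, centers):
    """One KAN layer: y_j = sum_i phi_{j,i}(x_i), where each univariate
    phi_{j,i} is a linear combination of shared basis functions.
    coefs has shape (n_out, n_in, n_basis)."""
    feats = rbf_features(x, centers)               # (batch, n_in, n_basis)
    return np.einsum('bik,oik->bo', feats, coefs)  # sum over inputs and basis

rng = np.random.default_rng(0)
centers = np.linspace(-2.0, 2.0, 8)
# Toy (4, 3, 2) network: compose two activation matrices Phi_0 and Phi_1.
Phi0 = rng.normal(size=(3, 4, 8)) * 0.1
Phi1 = rng.normal(size=(2, 3, 8)) * 0.1
x = rng.normal(size=(5, 4))
y = kan_layer(kan_layer(x, Phi0, centers), Phi1, centers)
print(y.shape)  # (5, 2)
```

Stacking layers in this way realizes the composition of activation matrices: every output coordinate is a sum of learned univariate transformations of the previous layer's coordinates.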
3. Experimental Verification and Comparative Metrics
BSRBF-KAN was benchmarked against MLP, EfficientKAN, FastKAN, FasterKAN, and GottliebKAN on MNIST and Fashion-MNIST (Ta, 17 Jun 2024). A canonical BSRBF-KAN configuration was:
- Layer sizes: (784, 64, 10)
- Base activation: SiLU
- Training procedure: AdamW optimizer, lr=1e-3, batch size=64, weight decay=1e-4, 15 epochs.
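The optimization setup above can be reproduced in a few lines of PyTorch. This is a hedged sketch of the training configuration only: a plain MLP with the paper's (784, 64, 10) layer sizes stands in for the BSRBF layers (the compound activations are not reimplemented here), and the batch is synthetic.

```python
import torch
from torch import nn

# Stand-in model with the (784, 64, 10) layer sizes from the paper; a real
# run would replace the Linear/SiLU body with BSRBF-KAN layers.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 64), nn.SiLU(),
                      nn.Linear(64, 10))

# AdamW with the reported hyperparameters: lr=1e-3, weight decay=1e-4.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative step on a fake MNIST-shaped batch (batch size 64).
images = torch.randn(64, 1, 28, 28)
labels = torch.randint(0, 10, (64,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```

In the reported experiments this loop runs for 15 epochs over the full training set; everything else about the procedure is as listed above.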
On MNIST, average validation accuracy across 5 runs was 97.55%; on Fashion-MNIST, 89.33%, with 100% training accuracy consistently achieved. BSRBF-KAN displayed stable convergence (the lowest and most consistent training/validation loss curves among the compared models) and robust performance across independent runs.
Compared to GottliebKAN, which achieved a slightly higher peak validation accuracy, BSRBF-KAN was distinguished by its combination of B-spline and RBF bases, yielding faster loss reduction and greater training stability.
4. Convergence and Stability Features
BSRBF-KAN’s stability is attributed to the synergy between B-spline local adaptability and Gaussian RBF global smoothness. Training and validation losses decrease sharply, and test accuracy remains consistently high across independent runs. This is contrasted with certain other KAN variants and baseline MLPs, where training instabilities, larger accuracy variance, or plateauing loss curves were observed.
The reliable convergence is especially advantageous for data fitting problems and scenarios with non-stationary or noisy inputs.
5. Application Domains
BSRBF-KAN is suited for:
- Function Approximation: High-fidelity interpolation and regression tasks where local control and global generalization are both required.
- Image Classification: Demonstrated by high MNIST performance, suitable for tasks where pattern or shape locality matters.
- Scientific Data Fitting: Medical imaging, analytical instrument modeling, and other domains benefiting from flexible, interpretable mathematical basis functions.
- Exploration of Compound Activation Functions: Framework paving the way for combining various mathematical bases, such as polynomials, wavelets, or Fourier transforms, with spline or RBF components.
6. Directions for Refinement and Broader Research
Possible advances include (Ta, 17 Jun 2024):
- Hyperparameter Optimization: Tailoring grid sizes, center spreads, and spline orders to match data complexity.
- Network Architecture Exploration: Employing deeper/wider networks, multi-modal function combinations, or hybrid activations.
- Application Expansion: Transfer of the architecture to larger image datasets or high-dimensional scientific data.
- Computational Efficiency: Streamlining combined basis computation to mitigate overhead compared to simpler MLPs.
The combination concept has also catalyzed related research, such as FC-KAN (Ta et al., 3 Sep 2024), which utilizes richer element-wise combinations (e.g., sum, product, quadratic, cubic forms) among splines, RBFs, and wavelets for enhanced feature extraction and slightly improved accuracy at the cost of increased parameter count.
7. Implications for Activation Function Design in Neural Networks
The demonstrated stability, convergence, and competitive accuracy of BSRBF-KAN signal the importance of investigating compound and adaptive activation functions. Experimental evidence suggests that combining spline and RBF activations yields more reliable learning and better generalization than single-basis function models. This approach is also transferable: the use of spline-based activations in MLPs has been shown, in other controlled settings, to significantly enhance symbolic formula representation (Yu et al., 23 Jul 2024). The field is moving toward task-specific, interpretable architectures by leveraging adaptive mathematical basis functions beyond the conventional fixed nonlinearities.
In conclusion, BSRBF-KAN exemplifies a mathematically principled extension of Kolmogorov-Arnold Networks by fusing B-splines and Gaussian radial basis functions, achieving robust function approximation and improved stability in empirical benchmarks. This compound activation paradigm lays the groundwork for future KAN architectures integrating diverse functional bases, supporting scientific, engineering, and data analysis applications that demand both flexibility and rigor.