SineKAN: Adaptive Sinusoidal Neural Nets
- SineKAN is a neural network architecture grounded in the Kolmogorov-Arnold theorem that uses adaptive grids of sinusoidal functions as learnable activation units.
- It parameterizes both inner and outer functions as weighted sums of sinusoids, which gives a universal approximation guarantee and has produced high empirical accuracy in vision and quantum many-body applications.
- The design significantly reduces computational costs compared to spline-based counterparts, enabling efficient scaling and robust performance in complex function approximation tasks.
SineKAN is a neural network architecture grounded in the Kolmogorov-Arnold representation theorem, distinguished by its use of adaptive grids of sinusoidal functions (sines) as learnable, univariate activation units. Unlike classical multilayer perceptrons (MLPs) or earlier Kolmogorov-Arnold Networks (KANs) that employ basis splines, SineKAN parameterizes both the inner and outer functions of the Kolmogorov-Arnold decomposition as sums of sines with learnable amplitudes and frequencies. This approach offers provable universal approximation, high empirical accuracy on tasks ranging from vision to quantum many-body physics, and a significant reduction in computational cost relative to spline-based KANs and certain dense neural architectures (Reinhardt et al., 2024; Gleyzer et al., 1 Aug 2025; S et al., 3 Mar 2025; Shamim et al., 2 Jun 2025).
1. Theoretical Foundations: Kolmogorov–Arnold Representation
The Kolmogorov–Arnold theorem states that any continuous function $f : [0,1]^n \to \mathbb{R}$ can be expressed as

$$f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),$$

where the inner functions $\phi_{q,p}$ and the outer functions $\Phi_q$ are univariate and continuous. SineKAN realizes both $\phi_{q,p}$ and $\Phi_q$ as finite, learnable expansions in sine bases, each being a weighted sum of sinusoidal functions with possibly learnable frequencies and fixed or learnable phase offsets (Gleyzer et al., 1 Aug 2025, Reinhardt et al., 2024, S et al., 3 Mar 2025). This ensures that, with sufficient expressivity (i.e., large enough sine bases), SineKAN can approximate any continuous multivariate function on a compact domain.
The Sinusoidal Approximation Theorem for Kolmogorov-Arnold Networks formalizes this: for any continuous $f$ on a compact domain and any $\epsilon > 0$, there exists a two-layer composition using sums of weighted sinusoids that approximates $f$ to within $\epsilon$ (Gleyzer et al., 1 Aug 2025).
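In symbols, the guarantee can be paraphrased as below. The notation is assumed for illustration only (the domain $[0,1]^n$, the index ranges, the amplitudes $a$ and $c$, the frequencies $\omega$ and $\nu$, and the phases $\varphi$ and $\theta$ are not taken from the cited paper), and the precise hypotheses and the exact form of the sine expansions are those of the cited work rather than of this sketch.

$$
\sup_{x \in [0,1]^n} \left| f(x) - \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right) \right| < \epsilon,
\qquad
\phi_{q,p}(t) = \sum_{k=1}^{K} a_{q,p,k} \sin(\omega_k t + \varphi_k),
\qquad
\Phi_q(u) = \sum_{m=1}^{M} c_{q,m} \sin(\nu_m u + \theta_m).
$$

Read this way, the theorem says that for each target accuracy $\epsilon$ some finite basis sizes $K$ and $M$ suffice; it does not by itself say how large they must be.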
2. Architecture and Parameterization
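SineKAN generalizes KAN layers by replacing B-spline edge activations with learnable grids of sinusoids. In its prototypical form, each edge from neuron $i$ in layer $l$ to neuron $j$ in layer $l+1$ carries an activation

$$\phi_{ij}^{(l)}(x) = \sum_{k=1}^{G} A_{ijk}^{(l)} \sin\!\left(\omega_k x + \varphi_k\right),$$

where $A_{ijk}^{(l)}$ are the learnable amplitudes and $\varphi_k$ are phase shifts, typically initialized on a uniform grid covering the domain (Reinhardt et al., 2024); $G$ is the number of sinusoids per edge and $\omega_k$ are their frequencies. Nodes sum their incoming edges and apply a learnable or fixed bias: $x_j^{(l+1)} = \sum_i \phi_{ij}^{(l)}\big(x_i^{(l)}\big) + b_j^{(l)}$. Stacking several such layers yields a deep SineKAN.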
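A minimal NumPy sketch of such a layer is given below. It assumes one particular parameterization that the text does not confirm in detail: integer frequencies `1..G` shared by all edges, a phase grid evenly spaced on `[0, pi)`, and amplitudes indexed by (output, input, sinusoid). The names `init_sinekan_layer` and `sinekan_forward` are hypothetical and are not taken from any published SineKAN code.

```python
import numpy as np

def init_sinekan_layer(in_dim, out_dim, grid_size, rng):
    """Hypothetical initializer: small random amplitudes, evenly spaced phases."""
    return {
        # amplitudes amp[j, i, k]: one weight per (output, input, sinusoid)
        "amp": rng.normal(0.0, 0.1, size=(out_dim, in_dim, grid_size)),
        # ASSUMPTION: integer frequencies 1..G, shared by all edges
        "freq": np.arange(1, grid_size + 1, dtype=float),
        # ASSUMPTION: phase grid evenly spaced on [0, pi)
        "phase": np.linspace(0.0, np.pi, grid_size, endpoint=False),
        "bias": np.zeros(out_dim),
    }

def sinekan_forward(x, layer):
    """Map x of shape (batch, in_dim) to shape (batch, out_dim)."""
    # basis[b, i, k] = sin(freq[k] * x[b, i] + phase[k])
    basis = np.sin(x[:, :, None] * layer["freq"] + layer["phase"])
    # out[b, j] = sum over i and k of amp[j, i, k] * basis[b, i, k], plus bias[j]
    return np.einsum("bik,jik->bj", basis, layer["amp"]) + layer["bias"]

# Stack two layers (4 -> 8 -> 3) to form a small deep network.
rng = np.random.default_rng(0)
layers = [init_sinekan_layer(4, 8, 5, rng), init_sinekan_layer(8, 3, 5, rng)]
h = rng.uniform(-1.0, 1.0, size=(2, 4))
for layer in layers:
    h = sinekan_forward(h, layer)
print(h.shape)  # (2, 3)
```

Because the sine itself is the nonlinearity, no separate activation is applied between layers in this sketch; each node simply sums its incoming edge functions and adds a bias.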
In specialized architectures:
- Vision Transformers (ViKANformer): The feed-forward block in each Transformer layer replaces its MLP with SineKAN, which applies a sine expansion to each input channel, concatenates the outputs, and linearly projects them to the output dimension. For input dimension $d$ and sine basis size $K$, the parameter count per layer scales with both $d$ and $K$, plus an optional bias (S et al., 3 Mar 2025).
- Neural Quantum States: SineKAN is applied as a variational wavefunction ansatz for many-body quantum systems, where spins are passed through several layers of SineKAN with shared sine grids and learnable frequencies to represent quantum states efficiently with manageable parameter counts (Shamim et al., 2 Jun 2025).
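The feed-forward variant described for ViKANformer (per-channel sine expansion, concatenation, linear projection) can be sketched as follows. This is an illustration under assumptions, not the published block: the function name `sine_feed_forward`, the shared frequency and phase grids, and the shapes are all hypothetical, and the parameter count computed at the end applies to this sketch only.

```python
import numpy as np

def sine_feed_forward(tokens, freq, phase, proj, bias=None):
    """tokens: (n_tokens, d); freq, phase: (K,); proj: (d * K, d_out)."""
    n_tokens, d = tokens.shape
    # expand every channel into K sine features: shape (n_tokens, d, K)
    feats = np.sin(tokens[:, :, None] * freq + phase)
    # concatenate the per-channel expansions: shape (n_tokens, d * K)
    flat = feats.reshape(n_tokens, d * freq.size)
    # linear projection to the output dimension
    out = flat @ proj
    return out if bias is None else out + bias

d, K, d_out = 6, 4, 6
rng = np.random.default_rng(0)
freq = np.arange(1, K + 1, dtype=float)             # ASSUMPTION: frequencies 1..K
phase = np.linspace(0.0, np.pi, K, endpoint=False)  # ASSUMPTION: uniform phase grid
proj = rng.normal(0.0, 0.1, size=(d * K, d_out))
y = sine_feed_forward(rng.normal(size=(10, d)), freq, phase, proj)
n_params = proj.size + freq.size + phase.size       # count for this sketch only
```

In this form the projection matrix dominates the parameter count, which is why both the channel dimension and the basis size drive the cost of the block.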
3. Training, Optimization, and Hyperparameters
Standard training procedures for SineKAN employ cross-entropy or $L_2$ error losses, Adam or trust-region least-squares optimizers, and mini-batch stochastic updates. Hyperparameters include the number of sinusoidal basis functions per edge, the learning rate, the batch size (128 for MNIST), and the initialization scheme: amplitudes are drawn from small random distributions and phase grids are initialized to evenly cover the domain (Reinhardt et al., 2024, S et al., 3 Mar 2025). With stable phase-grid initialization, dropout and batch normalization are generally not required.
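To make the training recipe concrete, here is a deliberately simplified sketch: a single sine expansion fitted to a one-dimensional target with a squared-error loss and mini-batch updates. It swaps in plain stochastic gradient descent for Adam or trust-region least squares, trains only the amplitudes and bias, and fixes the frequencies and phase grid; the target function, learning rate, step count, and all names are assumptions chosen for illustration.

```python
import numpy as np

def sine_basis(x, freq, phase):
    """Return sin(freq[k] * x[n] + phase[k]) with shape (n, G)."""
    return np.sin(x[:, None] * freq + phase)

def fit_amplitudes(x, y, grid_size=8, lr=0.05, steps=2000, batch=128, seed=0):
    rng = np.random.default_rng(seed)
    freq = np.arange(1, grid_size + 1, dtype=float)             # fixed in this sketch
    phase = np.linspace(0.0, np.pi, grid_size, endpoint=False)  # uniform phase grid
    amp = rng.normal(0.0, 0.01, size=grid_size)                 # small random amplitudes
    bias = 0.0
    for _ in range(steps):
        idx = rng.integers(0, x.size, size=batch)               # draw a mini-batch
        basis = sine_basis(x[idx], freq, phase)
        err = basis @ amp + bias - y[idx]
        # gradients of the mean squared error with respect to amp and bias
        amp -= lr * 2.0 * basis.T @ err / batch
        bias -= lr * 2.0 * err.mean()
    return freq, phase, amp, bias

def mse(x, y, params):
    freq, phase, amp, bias = params
    return float(np.mean((sine_basis(x, freq, phase) @ amp + bias - y) ** 2))

x = np.linspace(-np.pi, np.pi, 1024)
y = np.sin(2.0 * x + 0.3)                                       # toy target
initial_mse = mse(x, y, fit_amplitudes(x, y, steps=0))
final_mse = mse(x, y, fit_amplitudes(x, y))
print(initial_mse, final_mse)
```

With fixed phases the fit cannot reach zero error on this target, because a single sinusoid per frequency cannot absorb an arbitrary phase offset; making the phases or frequencies trainable is what removes that restriction.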
In the quantum physics use case, variational Monte Carlo is employed for energy minimization, with parallel Markov chains and problem-specific learning rate scheduling (Shamim et al., 2 Jun 2025).
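The sampling half of variational Monte Carlo can be sketched with a standard single-spin-flip Metropolis rule run over several chains in parallel. The sketch assumes a real, positive wavefunction supplied as a log-amplitude function; the sampler, chain count, and schedules used in the cited work are not specified here, and `metropolis_sample` and the toy `log_psi` are hypothetical stand-ins rather than a SineKAN ansatz.

```python
import numpy as np

def metropolis_sample(log_psi, n_spins, n_chains=4, n_steps=5000, burn_in=500, seed=0):
    """Sample spin configurations from |psi|^2 using parallel Markov chains."""
    rng = np.random.default_rng(seed)
    spins = rng.choice([-1.0, 1.0], size=(n_chains, n_spins))
    logp = 2.0 * log_psi(spins)              # log |psi|^2 for a real positive psi
    rows = np.arange(n_chains)
    samples = []
    for step in range(n_steps + burn_in):
        # propose flipping one randomly chosen site in every chain
        site = rng.integers(0, n_spins, size=n_chains)
        proposal = spins.copy()
        proposal[rows, site] *= -1.0
        logp_new = 2.0 * log_psi(proposal)
        # Metropolis rule: accept with probability min(1, |psi_new|^2 / |psi_old|^2)
        accept = np.log(rng.random(n_chains)) < (logp_new - logp)
        spins[accept] = proposal[accept]
        logp[accept] = logp_new[accept]
        if step >= burn_in:
            samples.append(spins.copy())
    return np.concatenate(samples, axis=0)

# Toy product state: log psi(s) = 0.25 * sum_i s_i (a stand-in, not a network).
log_psi = lambda s: 0.25 * s.sum(axis=1)
samples = metropolis_sample(log_psi, n_spins=3)
mean_spin = samples.mean()
```

For this toy state the exact mean spin is `tanh(0.5)`, which gives a simple check that the sampler targets `|psi|^2`. In an actual calculation the samples would feed a local-energy estimate whose gradient updates the network parameters.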
4. Empirical Performance and Benchmarks
Vision Tasks
On image classification (MNIST), SineKAN consistently matches or outperforms B-spline KANs and other non-spline KAN expansions:
- For hidden layer sizes 128–512 and 30-epoch training, SineKAN attains accuracy up to 0.9855, outperforming B-spline KAN at high widths (Reinhardt et al., 2024).
- Depth scaling shows continued improvements as the number of SineKAN layers increases, which is not observed for B-spline KANs.
- In ViKANformer, SineKAN achieves 97.8% test accuracy, F1 score 0.9789, and ROC AUC 0.9996 in 9 minutes per epoch. This slightly trails vanilla KAN but outperforms FourierKAN and matches Fast-KAN at substantially lower training overhead (S et al., 3 Mar 2025).
Quantum Many-Body Physics
- For Neural Quantum States representing large 1D spin chains, SineKAN matches or exceeds the accuracy of restricted Boltzmann machines (RBMs), LSTM-based models, and standard MLPs, while closely approaching Density Matrix Renormalization Group (DMRG) results (Shamim et al., 2 Jun 2025).
- With modest parameter counts (e.g., 86,433 for $L = 100$), SineKAN achieves low relative errors in the ground-state energy and can represent complex sign and amplitude structures.
- Reflection-symmetric variants further improve fidelity in near-degenerate or critical regimes.
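How the reflection-symmetric variant is constructed is not described here. One common way to impose mirror symmetry on any ansatz, which may or may not be the one used in the cited work, is to average a scalar network output over a configuration and its site-reversed image. The sketch below applies that generic technique to a toy function; `reflect_symmetrize` and `toy` are hypothetical names.

```python
import numpy as np

def reflect_symmetrize(f):
    """Wrap f so that its output is unchanged when the chain is mirrored."""
    def f_sym(spins):
        # spins: (batch, n_sites); [:, ::-1] reverses the site order
        return 0.5 * (f(spins) + f(spins[:, ::-1]))
    return f_sym

# Toy stand-in for a scalar network output; not a SineKAN.
weights = np.array([0.3, -1.2, 0.7, 0.5])
toy = lambda s: np.sin(s @ weights)
toy_sym = reflect_symmetrize(toy)

spins = np.array([[1.0, -1.0, -1.0, 1.0],
                  [1.0, 1.0, -1.0, -1.0]])
same_after_mirror = np.allclose(toy_sym(spins), toy_sym(spins[:, ::-1]))
```

The wrapper costs two forward passes per configuration and adds no parameters, which is the usual trade-off of symmetrizing by averaging rather than by weight sharing.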
Analytical Function Approximation
- On smooth, oscillatory, and singular function interpolation tasks, SineKAN outperforms classical Fourier series and matches or betters sine-activated MLPs, often achieving 1–2 orders of magnitude lower error per parameter (Gleyzer et al., 1 Aug 2025).
| Model/Variant | MNIST accuracy (h=128) | Speed-up vs. B-SplineKAN | Parameters (quantum, L=100) |
|---|---|---|---|
| SineKAN | 0.9831–0.9855 | 4–9× | 86,433 |
| B-SplineKAN | 0.9835 | 1× (baseline) | — |
| RBM (quantum) | — | — | 1.29M |
| LSTM (quantum) | — | — | 83,240 |
5. Computational Cost, Scaling, and Trade-offs
SineKAN layers, by leveraging the Kolmogorov–Arnold structure and using adaptive sine bases, concentrate representational power into a small number of univariate (frequency) channels. Parameter counts are set by the layer widths and the sine basis size or, for $n$ input dimensions, by the node counts that the Kolmogorov–Arnold theorem dictates ($2n+1$ outer functions and $n(2n+1)$ inner functions). Empirically:
- SineKAN achieves 4–9× faster inference than optimized B-spline KANs at equal basis size, as each edge’s B-spline lookup is replaced by efficient sinusoidal evaluation (Reinhardt et al., 2024).
- In ViKANformer, SineKAN incurs 5–8× greater GPU time per epoch than an MLP block; inference latency per layer grows with the sine basis size but remains practical for modest basis sizes (S et al., 3 Mar 2025).
- Training remains tractable up to very large quantum systems owing to parameter and memory efficiency.
Potential optimizations include GPU-custom kernels for sine evaluations, pruning of frequency channels, and the application of SineKAN only to select input dimensions or via hybrid architectures (S et al., 3 Mar 2025).
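Pruning of frequency channels could take many forms; the text does not specify one. The sketch below shows one simple possibility, assumed for illustration: rank each sinusoid by the norm of its amplitudes across all edges of a layer and keep only the strongest. The amplitude layout `(out_dim, in_dim, G)` and the name `prune_frequency_channels` are hypothetical.

```python
import numpy as np

def prune_frequency_channels(amp, freq, phase, keep):
    """amp: (out_dim, in_dim, G); freq, phase: (G,). Keep the `keep` strongest channels."""
    # overall strength of each sinusoid: L2 norm of its amplitudes over all edges
    strength = np.sqrt((amp ** 2).sum(axis=(0, 1)))
    # indices of the strongest channels, restored to their original order
    kept = np.sort(np.argsort(strength)[-keep:])
    return amp[:, :, kept], freq[kept], phase[kept]

# Toy layer in which every second channel carries almost no weight.
rng = np.random.default_rng(0)
amp = rng.normal(size=(3, 4, 6)) * np.array([1.0, 0.01, 1.0, 0.01, 1.0, 0.01])
freq = np.arange(1.0, 7.0)
phase = np.linspace(0.0, np.pi, 6, endpoint=False)
amp_p, freq_p, phase_p = prune_frequency_channels(amp, freq, phase, keep=3)
print(freq_p)  # [1. 3. 5.]
```

Halving the number of channels halves both the amplitude tensor and the number of sine evaluations per input, so this kind of pruning reduces memory and latency together; whether accuracy survives would have to be checked per task.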
6. Advantages, Limitations, and Extension Directions
Advantages:
- Proven universal approximation and adaptivity for arbitrary continuous mappings via the sinusoidal Kolmogorov–Arnold structure (Gleyzer et al., 1 Aug 2025).
- Superior empirical accuracy per parameter on vision, function approximation, and quantum NQS tasks.
- Significant inference speedups and training tractability compared to spline KANs and (in selected regimes) dense MLPs.
Limitations:
- Interpretability lags behind spline-based KANs, which offer sparsifiability and partial symbolic understanding.
- Specific initialization of phase grids is critical; misconfiguration may slow or destabilize training (“R(g) scaling law”).
- Computational cost per layer, though efficient versus spline-based KANs, exceeds that of standard ReLU or affine activations by a small constant due to trigonometric evaluation.
Future Directions:
- Incorporation of learnable frequencies and mixed sine/spline bases to further improve representational flexibility (Reinhardt et al., 2024).
- Integration with residual attention or Transformer-style skip connections to enable very deep, stable SineKANs.
- Application in complex-valued settings (e.g., directly learning sign structures in frustrated quantum matter) and in architectures enforcing additional symmetries or invariances (Shamim et al., 2 Jun 2025).
- Scaling to larger-scale vision tasks, structured predictions, and physical systems beyond one-dimensional quantum chains.
7. Context in the Neural Network Landscape
SineKAN occupies a unique position between traditional MLPs (universality via dense, layered compositions and fixed nonlinearities), KANs with spline bases (interpretability, universality but higher compute cost), and periodic/frequency-based methods such as Fourier transform approximations. SineKAN’s adaptive-frequency sinusoidal units provide a principled, theoretically complete, and computationally viable path to multivariate function representation, with empirical results validating its capabilities and highlighting clear trade-offs. The reflection-symmetric SineKAN (“rSineKAN”), in particular, excels in tasks requiring symmetry-adapted representations, as in quantum spin systems (Shamim et al., 2 Jun 2025). Further exploration of SineKAN within hybrid deep architectures is an active and promising direction.