SineKAN: Adaptive Sinusoidal Neural Nets

Updated 4 April 2026
  • SineKAN is a neural network architecture grounded in the Kolmogorov-Arnold theorem that uses adaptive grids of sinusoidal functions as learnable activation units.
  • It parameterizes both inner and outer functions as weighted sums of sinusoids, ensuring universal approximation and high empirical accuracy in vision and quantum many-body applications.
  • The design significantly reduces computational costs compared to spline-based counterparts, enabling efficient scaling and robust performance in complex function approximation tasks.

SineKAN is a neural network architecture grounded in the Kolmogorov-Arnold representation theorem, distinguished by its use of adaptive grids of sinusoidal functions (sines) as learnable, univariate activation units. Unlike classical multilayer perceptrons (MLPs) or earlier Kolmogorov-Arnold Networks (KANs) that employ basis splines, SineKAN parameterizes both inner and outer functions of the Kolmogorov-Arnold decomposition as sums of sines with learnable amplitudes and, depending on the variant, learnable frequencies. This approach achieves provably universal approximation, high empirical accuracy on tasks ranging from vision to quantum many-body physics, and a significant reduction in computational cost relative to spline-based KANs and certain dense neural architectures (Reinhardt et al., 2024, Gleyzer et al., 1 Aug 2025, S et al., 3 Mar 2025, Shamim et al., 2 Jun 2025).

1. Theoretical Foundations: Kolmogorov–Arnold Representation

The Kolmogorov–Arnold theorem states that any continuous function $f : [0,1]^n \to \mathbb{R}$ can be expressed as

$$f(x_1, \ldots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\left( \sum_{p=1}^n \psi_{q, p}(x_p) \right)$$

where the inner functions $\psi_{q, p}$ and the outer functions $\Phi_q$ are univariate and continuous. SineKAN realizes both $\psi_{q,p}$ and $\Phi_q$ as finite, learnable expansions in sine bases: each is a weighted sum of sinusoidal functions with possibly learnable frequencies and fixed or learnable phase offsets (Gleyzer et al., 1 Aug 2025, Reinhardt et al., 2024, S et al., 3 Mar 2025). This ensures that, with sufficient expressivity (i.e., large enough sine bases), SineKAN can approximate any continuous multivariate function on a compact domain.

The Sinusoidal Approximation Theorem for Kolmogorov-Arnold Networks formalizes this: for any continuous $f$ and any $\epsilon > 0$, there exists a two-layer composition using sums of weighted sinusoids that approximates $f$ within $\epsilon$ (Gleyzer et al., 1 Aug 2025).
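To make the two-layer structure concrete, the following is a minimal NumPy sketch of a Kolmogorov–Arnold form in which every inner function $\psi_{q,p}$ and every outer function $\Phi_q$ is a finite sine sum. The names `sine_sum`, `ka_sine_model`, and `random_params`, the parameter layout, and the random values are illustrative assumptions, not the construction used in the cited proof.

```python
import numpy as np

def sine_sum(t, amp, freq, phase):
    """Univariate sine expansion: sum_k amp[k] * sin(freq[k] * t + phase[k])."""
    return float(np.sum(amp * np.sin(freq * t + phase)))

def ka_sine_model(x, inner, outer):
    """Evaluate sum_q Phi_q(sum_p psi_{q,p}(x_p)) with sine-sum psi and Phi.

    x     : array of shape (n,)
    inner : dict of arrays with shape (2n+1, n, K), parameters of psi_{q,p}
    outer : dict of arrays with shape (2n+1, K), parameters of Phi_q
    (hypothetical layout chosen for this sketch)
    """
    n = len(x)
    total = 0.0
    for q in range(2 * n + 1):
        # Inner layer: sum the univariate functions of each coordinate.
        inner_sum = sum(
            sine_sum(x[p], inner["amp"][q, p], inner["freq"][q, p], inner["phase"][q, p])
            for p in range(n)
        )
        # Outer layer: one univariate function applied to the inner sum.
        total += sine_sum(inner_sum, outer["amp"][q], outer["freq"][q], outer["phase"][q])
    return total

def random_params(shape, rng):
    """Random amplitudes, frequencies, and phases (illustrative values only)."""
    return {
        "amp": rng.normal(0.0, 0.5, size=shape),
        "freq": rng.uniform(0.5, 3.0, size=shape),
        "phase": rng.uniform(0.0, np.pi, size=shape),
    }

rng = np.random.default_rng(0)
n, K = 2, 4
inner = random_params((2 * n + 1, n, K), rng)
outer = random_params((2 * n + 1, K), rng)
value = ka_sine_model(np.array([0.3, 0.7]), inner, outer)
```

The sketch only evaluates the representation; it says nothing about how large $K$ must be for a given $\epsilon$.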

2. Architecture and Parameterization

SineKAN generalizes KAN layers by replacing B-spline edge activations with learnable grids of sinusoids. In its prototypical form, each edge connecting a neuron in one layer to a neuron in the next carries a univariate activation: a weighted sum of phase-shifted sinusoids of the edge's input. The weights are learnable amplitudes, and the phase shifts are typically initialized on a uniform grid (Reinhardt et al., 2024). Each node sums its incoming edge activations and adds a learnable or fixed bias. Stacking several such layers yields a deep SineKAN.
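The layer described above can be sketched as a single batched forward pass. This is a hedged illustration: the function name `sinekan_layer`, the tensor shapes, the shared frequency vector `freq`, and the per-input phase grid `phase` are assumptions made for the example, and the exact parameterization in the cited papers may differ.

```python
import numpy as np

def sinekan_layer(x, amp, freq, phase, bias):
    """Forward pass of one sine-based KAN layer (illustrative sketch).

    x     : (batch, d_in)       layer inputs
    amp   : (d_out, d_in, K)    learnable amplitude per edge and basis term
    freq  : (K,)                frequencies (assumed shared across edges)
    phase : (d_in, K)           phase shifts (assumed one grid per input)
    bias  : (d_out,)            node bias
    """
    # s[b, j, k] = sin(freq[k] * x[b, j] + phase[j, k])
    s = np.sin(x[:, :, None] * freq[None, None, :] + phase[None, :, :])
    # y[b, i] = sum_{j,k} amp[i, j, k] * s[b, j, k] + bias[i]
    return np.einsum("bjk,ijk->bi", s, amp) + bias

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 3))
amp = rng.normal(size=(4, 3, 6))
freq = np.arange(1, 7, dtype=float)
phase = rng.uniform(0.0, np.pi, size=(3, 6))
bias = rng.normal(size=4)
y = sinekan_layer(x, amp, freq, phase, bias)  # shape (5, 4)
```

Because every edge shares the same sine features `s`, the sines are evaluated once per input and reused for all output nodes, which keeps the per-layer cost close to one trigonometric evaluation per (input, basis term) pair plus a tensor contraction.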

In specialized architectures:

  • Vision Transformers (ViKANformer): The feed-forward block in each Transformer layer replaces the MLP with SineKAN: it applies dimension-wise sine expansions (one per input channel), concatenates the outputs, and linearly projects to the output dimension. The parameter count per layer is set by the input dimension and the sine basis size, plus an optional bias (S et al., 3 Mar 2025).
  • Neural Quantum States: SineKAN is applied as a variational wavefunction ansatz for many-body quantum systems, where spins are passed through several layers of SineKAN with shared sine grids and learnable frequencies to represent quantum states efficiently with manageable parameter counts (Shamim et al., 2 Jun 2025).
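The expand, concatenate, and project pattern described for the ViKANformer feed-forward block can be illustrated as follows. This is a sketch under stated assumptions: `sine_expand` and `sine_ffn` are hypothetical names, the frequencies and phases are fixed here for brevity, and the projection matrix `W` of shape `(d * K, d_out)` is a property of this sketch rather than a confirmed detail of the published model.

```python
import numpy as np

def sine_expand(x, freq, phase):
    """Dimension-wise sine expansion: (tokens, d) -> (tokens, d * K)."""
    feats = np.sin(x[:, :, None] * freq[None, None, :] + phase[None, None, :])
    # Concatenate the K sine features of every input channel.
    return feats.reshape(x.shape[0], -1)

def sine_ffn(x, freq, phase, W, b=None):
    """Sine expansion followed by a linear projection to the output dimension."""
    h = sine_expand(x, freq, phase)
    y = h @ W
    return y if b is None else y + b

rng = np.random.default_rng(2)
tokens, d, K, d_out = 10, 16, 4, 16
x = rng.normal(size=(tokens, d))
freq = np.arange(1, K + 1, dtype=float)
phase = np.linspace(0.0, np.pi, K, endpoint=False)
W = rng.normal(0.0, 0.02, size=(d * K, d_out))
b = np.zeros(d_out)
y = sine_ffn(x, freq, phase, W, b)

# Parameter count of THIS sketch only: projection weights plus optional bias.
n_params = W.size + b.size
```

In this sketch the expanded width grows linearly with the basis size `K`, which is why the basis size is the main knob trading accuracy against cost.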

3. Training, Optimization, and Hyperparameters

Standard training procedures for SineKAN employ cross-entropy or regression error losses, Adam or trust-region least-squares optimizers, and mini-batch stochastic updates. Hyperparameters include the number of sinusoidal basis functions per edge, the learning rate, the batch size (128 for MNIST), and the initialization scheme: amplitudes are drawn from small random distributions and phase grids are initialized to cover the domain evenly (Reinhardt et al., 2024, S et al., 3 Mar 2025). With stable phase-grid initialization, dropout and batch normalization are generally not required.
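As a rough illustration of the initialization and batching choices mentioned above, the sketch below draws small random amplitudes, places phases on an evenly spaced grid, and iterates over shuffled mini-batches of 128. The function names, the amplitude scale `0.01`, the phase span `[0, pi)`, and the basis size `K=8` are assumptions for the example; only the batch size of 128 for MNIST comes from the text.

```python
import numpy as np

def init_sine_layer(d_in, d_out, K, phase_span=(0.0, np.pi), amp_scale=0.01, seed=0):
    """Small random amplitudes and an evenly spaced phase grid (assumed values)."""
    rng = np.random.default_rng(seed)
    amp = rng.normal(0.0, amp_scale, size=(d_out, d_in, K))
    phase = np.linspace(phase_span[0], phase_span[1], K, endpoint=False)
    return amp, phase

def minibatches(X, y, batch_size=128, seed=0):
    """Yield shuffled mini-batches; the last batch may be smaller."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]

# 784 = 28 x 28 flattened MNIST pixels; 128 is one of the hidden widths in the text.
amp, phase = init_sine_layer(d_in=784, d_out=128, K=8)

# Placeholder data standing in for a dataset of 300 samples.
X = np.zeros((300, 784))
y = np.zeros(300, dtype=int)
batch_sizes = [len(xb) for xb, _ in minibatches(X, y)]
```

The optimizer step itself (Adam or a trust-region least-squares solver) is left out; any standard implementation could consume these batches.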

In the quantum physics use case, variational Monte Carlo is employed for energy minimization, with parallel Markov chains and problem-specific learning rate scheduling (Shamim et al., 2 Jun 2025).
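The sampling half of variational Monte Carlo can be sketched with parallel Metropolis chains over spin configurations. This is a generic single-spin-flip sampler, not the procedure from the cited work: `metropolis_chains` is a hypothetical name, `toy_log_amp` is a stand-in for the network's log-amplitude, and the energy estimation and parameter update steps are omitted.

```python
import numpy as np

def metropolis_chains(log_amp, n_spins, n_chains=4, n_steps=200, seed=0):
    """Sample spins s in {-1, +1}^n_spins from |psi(s)|^2 with parallel chains.

    log_amp(s) must return log|psi(s)| for a batch s of shape (n_chains, n_spins).
    """
    rng = np.random.default_rng(seed)
    s = rng.choice([-1.0, 1.0], size=(n_chains, n_spins))
    logp = 2.0 * log_amp(s)              # log |psi|^2 for each chain
    rows = np.arange(n_chains)
    for _ in range(n_steps):
        # Propose flipping one randomly chosen spin in every chain.
        site = rng.integers(0, n_spins, size=n_chains)
        proposal = s.copy()
        proposal[rows, site] *= -1.0
        logp_new = 2.0 * log_amp(proposal)
        # Metropolis rule: accept with probability min(1, |psi_new|^2 / |psi_old|^2).
        u = 1.0 - rng.random(n_chains)   # uniform on (0, 1], avoids log(0)
        accept = np.log(u) < (logp_new - logp)
        s[accept] = proposal[accept]
        logp[accept] = logp_new[accept]
    return s

def toy_log_amp(s):
    """Hypothetical log-amplitude built from a sine of neighbouring-spin products."""
    return 0.3 * np.sin(s[:, :-1] * s[:, 1:]).sum(axis=1)

samples = metropolis_chains(toy_log_amp, n_spins=10)
```

In a full variational loop, the samples would feed a local-energy estimator whose gradient drives the optimizer, with the learning-rate schedule chosen per problem as noted above.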

4. Empirical Performance and Benchmarks

Vision Tasks

On image classification (MNIST), SineKAN consistently matches or outperforms B-spline KANs and other non-spline KAN expansions:

  • For hidden layer sizes 128–512 and 30-epoch training, SineKAN attains accuracy up to 0.9855, outperforming B-spline KAN at high widths (Reinhardt et al., 2024).
  • Depth scaling shows continued improvements as the number of SineKAN layers increases, which is not observed for B-spline KANs.
  • In ViKANformer, SineKAN achieves 97.8% test accuracy, F1 score 0.9789, and ROC AUC 0.9996 in 9 minutes per epoch. This slightly trails vanilla KAN but outperforms FourierKAN and matches Fast-KAN at substantially lower training overhead (S et al., 3 Mar 2025).

Quantum Many-Body Physics

  • For Neural Quantum States representing large 1D spin chains, SineKAN matches or exceeds the accuracy of restricted Boltzmann machines (RBMs), LSTM-based models, and standard MLPs, while closely approaching Density Matrix Renormalization Group (DMRG) results (Shamim et al., 2 Jun 2025).
  • With manageable parameter counts (e.g., 86,433 for L = 100), SineKAN achieves small relative errors in the ground-state energy and can represent complex sign and amplitude structures.
  • Reflection-symmetric variants further improve fidelity in near-degenerate or critical regimes.

Analytical Function Approximation

  • On smooth, oscillatory, and singular function interpolation tasks, SineKAN outperforms classical Fourier series and matches or betters sine-activated MLPs, often achieving 1–2 orders of magnitude lower error per parameter (Gleyzer et al., 1 Aug 2025).

Model/Variant     MNIST accuracy (h=128)   Speed-up vs. B-SplineKAN   Params (L=100, quantum)
SineKAN           0.9831–0.9855            4–9×                       86,433
B-SplineKAN       0.9835                   1× (baseline)              -
RBM (quantum)     -                        -                          1.29M
LSTM (quantum)    -                        -                          83,240
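
For the analytical function approximation setting, a one-dimensional toy version is easy to write down: fix a sine basis and solve for the amplitudes by linear least squares. The sketch below is only meant to show the mechanics; the target function, the basis size `K = 8`, the integer frequencies, and the phase grid are arbitrary assumptions, and it does not reproduce the benchmarks in the cited paper.

```python
import numpy as np

def sine_design_matrix(x, freq, phase):
    """Column k holds sin(freq[k] * x + phase[k]) evaluated at all sample points."""
    return np.sin(np.outer(x, freq) + phase)

def fit_amplitudes(x, target, freq, phase):
    """Least-squares amplitudes for a fixed sine basis."""
    A = sine_design_matrix(x, freq, phase)
    amp, *_ = np.linalg.lstsq(A, target, rcond=None)
    return amp

x = np.linspace(0.0, 1.0, 200)
target = np.exp(-x) * np.cos(4.0 * x)      # arbitrary smooth test function

K = 8
freq = np.arange(1, K + 1, dtype=float)
phase = np.linspace(0.0, np.pi, K, endpoint=False)

amp = fit_amplitudes(x, target, freq, phase)
prediction = sine_design_matrix(x, freq, phase) @ amp
rms_error = np.sqrt(np.mean((prediction - target) ** 2))
rms_target = np.sqrt(np.mean(target ** 2))
```

Comparing `rms_error` against `rms_target` gives a quick sense of how much of the signal the basis captures; making the frequencies and phases learnable as well turns this into a nonlinear problem, which is where gradient-based or trust-region optimizers come in.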

5. Computational Cost, Scaling, and Trade-offs

SineKAN layers, by leveraging the Kolmogorov–Arnold structure and using adaptive sine bases, concentrate representational power into a small number of univariate (frequency) channels. Parameter counts scale with the number of input dimensions and the sine basis size, or as dictated by the Kolmogorov–Arnold theorem, whose decomposition for $n$ inputs uses $2n+1$ outer functions and $n(2n+1)$ inner functions. Empirically:

  • SineKAN achieves 4–9× faster inference than optimized B-spline KANs at equal basis size, as each edge’s B-spline lookup is replaced by efficient sinusoidal evaluation (Reinhardt et al., 2024).
  • In ViKANformer, SineKAN incurs 5–8× greater GPU time per epoch than an MLP block; inference latency per layer grows with the size of the sine expansion but remains practical for modest basis sizes (S et al., 3 Mar 2025).
  • Training remains tractable up to very large quantum systems owing to parameter and memory efficiency.
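
The function counts implied by the Kolmogorov–Arnold decomposition in Section 1 are simple arithmetic: for $n$ inputs there are $2n+1$ outer functions $\Phi_q$ and $n(2n+1)$ inner functions $\psi_{q,p}$. The helper below (a hypothetical name) also reports an amplitude count under the assumption that every univariate function is a sine sum with `K` terms and one amplitude per term; frequencies, phases, and biases are not counted.

```python
def ka_counts(n, K):
    """Function and amplitude counts for a Kolmogorov-Arnold decomposition.

    n : number of input dimensions
    K : sine terms per univariate function (an assumption of this sketch)
    """
    outer = 2 * n + 1            # one Phi_q per q = 1..2n+1
    inner = n * (2 * n + 1)      # one psi_{q,p} per (q, p) pair
    amplitudes = (outer + inner) * K
    return outer, inner, amplitudes

outer, inner, amplitudes = ka_counts(n=3, K=8)
# n = 3, K = 8 -> 7 outer functions, 21 inner functions, 224 amplitudes
```

Under this counting the number of univariate functions grows quadratically in `n`, while the cost of each function grows linearly in `K`.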

Potential optimizations include GPU-custom kernels for sine evaluations, pruning of frequency channels, and the application of SineKAN only to select input dimensions or via hybrid architectures (S et al., 3 Mar 2025).
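
One of the optimizations listed above, pruning of frequency channels, could look like the following magnitude-based sketch. The criterion (L2 norm of the amplitudes attached to each channel), the shared frequency vector, and the name `prune_frequency_channels` are assumptions for illustration; the source does not specify a pruning rule.

```python
import numpy as np

def prune_frequency_channels(amp, freq, keep):
    """Keep the `keep` frequency channels with the largest amplitude norm.

    amp  : (d_out, d_in, K) amplitudes
    freq : (K,) frequencies, assumed shared across all edges
    """
    if not 1 <= keep <= amp.shape[-1]:
        raise ValueError("keep must be between 1 and the number of channels")
    # Overall strength of each channel across all edges.
    strength = np.sqrt((amp ** 2).sum(axis=(0, 1)))
    # Indices of the strongest channels, returned in their original order.
    kept = np.sort(np.argsort(strength)[-keep:])
    return amp[:, :, kept], freq[kept], kept

amp = np.zeros((2, 3, 4))
amp[:, :, 0] = 0.01
amp[:, :, 1] = 1.0
amp[:, :, 3] = 0.5
freq = np.array([1.0, 2.0, 3.0, 4.0])

pruned_amp, pruned_freq, kept = prune_frequency_channels(amp, freq, keep=2)
# kept -> channels 1 and 3, the two with the largest amplitudes
```

After pruning, a short fine-tuning pass would typically be needed to recover any lost accuracy.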

6. Advantages, Limitations, and Extension Directions

Advantages:

  • Proven universal approximation and adaptivity for arbitrary continuous mappings via the sinusoidal Kolmogorov–Arnold structure (Gleyzer et al., 1 Aug 2025).
  • Superior empirical accuracy per parameter on vision, function approximation, and quantum NQS tasks.
  • Significant inference speedups and training tractability compared to spline KANs and (in selected regimes) dense MLPs.

Limitations:

  • Interpretability lags behind spline-based KANs, which offer sparsifiability and partial symbolic understanding.
  • Specific initialization of phase grids is critical; misconfiguration may slow or destabilize training (“R(g) scaling law”).
  • Computational cost per layer, though efficient versus spline-based KANs, exceeds that of standard ReLU or affine activations by a small constant due to trigonometric evaluation.

Future Directions:

  • Incorporation of learnable frequencies and mixed sine/spline bases to further improve representational flexibility (Reinhardt et al., 2024).
  • Integration with residual attention or Transformer-style skip connections to enable very deep, stable SineKANs.
  • Application in complex-valued settings (e.g., directly learning sign structures in frustrated quantum matter) and in architectures enforcing additional symmetries or invariances (Shamim et al., 2 Jun 2025).
  • Scaling to larger-scale vision tasks, structured predictions, and physical systems beyond one-dimensional quantum chains.
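
The skip-connection idea among the future directions can be illustrated with a residual wrapper around a sine layer. This is a speculative sketch, not a published design: `sine_layer` and `residual_sine_stack` are hypothetical names, and the residual form requires each layer to preserve the feature dimension.

```python
import numpy as np

def sine_layer(x, amp, freq, phase):
    """Sine-expansion layer: x (batch, d) -> (batch, d_out), no bias."""
    s = np.sin(x[:, :, None] * freq + phase)        # (batch, d, K)
    return np.einsum("bjk,ijk->bi", s, amp)

def residual_sine_stack(x, layers):
    """Apply x <- x + layer(x) for each (amp, freq, phase) triple."""
    for amp, freq, phase in layers:
        x = x + sine_layer(x, amp, freq, phase)     # identity skip connection
    return x

rng = np.random.default_rng(3)
d, K, depth = 6, 4, 5
freq = np.arange(1, K + 1, dtype=float)
phase = np.linspace(0.0, np.pi, K, endpoint=False)
layers = [(rng.normal(0.0, 0.01, size=(d, d, K)), freq, phase) for _ in range(depth)]

x = rng.normal(size=(8, d))
out = residual_sine_stack(x, layers)
```

With small amplitudes each block starts close to the identity map, which is the usual motivation for residual connections in deep stacks.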

7. Context in the Neural Network Landscape

SineKAN occupies a unique position between traditional MLPs (universality via dense, layered compositions and fixed nonlinearities), KANs with spline bases (interpretability, universality but higher compute cost), and periodic/frequency-based methods such as Fourier transform approximations. SineKAN’s adaptive-frequency sinusoidal units provide a principled, theoretically complete, and computationally viable path to multivariate function representation, with empirical results validating its capabilities and highlighting clear trade-offs. The reflection-symmetric SineKAN (“rSineKAN”), in particular, excels in tasks requiring symmetry-adapted representations, as in quantum spin systems (Shamim et al., 2 Jun 2025). Further exploration of SineKAN within hybrid deep architectures is an active and promising direction.
