Sinusoidal Representation Networks (SIRENs)

Updated 29 March 2026
  • SIRENs are implicit neural representations that use sinusoidal activations to model continuous signals such as images, audio, and 3D shapes.
  • They achieve high-frequency, smooth representations via layered sine functions with careful initialization and frequency control.
  • Their analytic derivatives and adaptable spectral properties make them effective for vision, graphics, medical imaging, and physics-informed PDEs.

Sinusoidal Representation Networks (SIRENs) are a class of implicit neural representations (INRs) that model continuous signals such as images, audio, 3D shapes, and PDE solutions using multilayer perceptrons (MLPs) endowed with sinusoidal (sine) activations. SIRENs are distinguished by their capacity to faithfully capture fine-scale, high-frequency components, and to analytically represent arbitrary-order derivatives, properties essential in domains like computer vision, graphics, and scientific computing (Sitzmann et al., 2020).

1. Architecture and Initialization Principles

A SIREN is a fully connected neural network in which every nonlinearity is a sine function. For an input $\mathbf{x} \in \mathbb{R}^d$, a typical SIREN with $L$ hidden layers of fixed width $N$ has the recursive structure:

$$\begin{aligned}
\mathbf{h}_0 &= \mathbf{x} \\
\mathbf{h}_1 &= \sin(\omega_0 W_0 \mathbf{h}_0 + \mathbf{b}_0) \\
\mathbf{h}_{\ell+1} &= \sin(W_\ell \mathbf{h}_\ell + \mathbf{b}_\ell)\quad (\ell = 1,\dots,L-1) \\
f_\theta(\mathbf{x}) &= W_L \mathbf{h}_L + \mathbf{b}_L.
\end{aligned}$$

The scaling constant $\omega_0$ controls the input layer's frequency and is typically fixed at 30 for vision tasks (Sitzmann et al., 2020; Origer et al., 2024; Miotto et al., 29 Jan 2025). Subsequent layers may use a frequency factor $\omega_\ell = 1$, or omit the factor entirely, depending on the task (Conde et al., 2024).

SIREN initialization is crucial: the original prescription sets first-layer weights to

$$W_0 \sim \mathcal{U}\left[-\frac{1}{n_\text{in}},\,\frac{1}{n_\text{in}}\right]$$

and higher-layer weights as

$$W_\ell \sim \mathcal{U}\left[-\sqrt{\frac{6}{n_\ell}}\Big/\omega_0,\;\sqrt{\frac{6}{n_\ell}}\Big/\omega_0\right],$$

with all biases zeroed, ensuring that pre-activations have $\mathcal{O}(1)$ variance so gradients neither vanish nor explode, enabling stable and deep training (Sitzmann et al., 2020; Gao et al., 2024; Conde et al., 2024; Origer et al., 2024; Miotto et al., 29 Jan 2025). Proper input scaling is often used so all coordinates fall in $[-1,1]$ (Origer et al., 2024).
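
The recursion and initialization described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function names `init_siren` and `siren_forward` are hypothetical, $n_\ell$ is assumed to be the fan-in of layer $\ell$, and the final linear layer is assumed to use the same bound as the hidden layers.

```python
import numpy as np

def init_siren(n_in, width, n_hidden, n_out, omega_0=30.0, seed=0):
    """Return a list of (W, b) pairs: n_hidden sine layers plus one linear output layer."""
    rng = np.random.default_rng(seed)
    sizes = [n_in] + [width] * n_hidden + [n_out]
    params = []
    for layer, (fan_in, fan_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        if layer == 0:
            bound = 1.0 / fan_in                      # first layer: U[-1/n_in, 1/n_in]
        else:
            bound = np.sqrt(6.0 / fan_in) / omega_0   # later layers (fan-in assumed for n_l)
        W = rng.uniform(-bound, bound, size=(fan_out, fan_in))
        b = np.zeros(fan_out)                         # biases zeroed, as in the text
        params.append((W, b))
    return params

def siren_forward(params, x, omega_0=30.0):
    """Evaluate the network on a batch x of shape (n_points, n_in)."""
    W0, b0 = params[0]
    h = np.sin(omega_0 * (x @ W0.T) + b0)             # h_1 = sin(omega_0 W_0 h_0 + b_0)
    for W, b in params[1:-1]:
        h = np.sin(h @ W.T + b)                       # h_{l+1} = sin(W_l h_l + b_l)
    WL, bL = params[-1]
    return h @ WL.T + bL                              # linear output layer

# Coordinates are scaled to [-1, 1] before being fed to the network.
coords = np.linspace(-1.0, 1.0, 5).reshape(-1, 1)
params = init_siren(n_in=1, width=16, n_hidden=3, n_out=1)
out = siren_forward(params, coords)
```

With `n_hidden=3` the list `params` holds four weight matrices, matching $W_0,\dots,W_L$ for $L = 3$ in the recursion.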

Recent theoretical advances refine these strategies by ensuring Jacobian variance control and pre-activation scaling to precisely regulate both training dynamics and frequency support (Combette et al., 6 Dec 2025).

2. Spectral Properties and Functional Expressivity

The defining functional property of SIRENs is their ability to represent signals as compositions of affine transformations and periodic nonlinearities, yielding a basis analogous to a parameterized Fourier series. Formally, the composition of affine maps and $\sin(\cdot)$ in each layer induces an expansion in which the network computes a sum of harmonics whose frequencies are integer linear combinations of input frequencies and layer weights (Novello, 2022; Novello et al., 2024):

$$h(x) = \sin\left(\sum_{i=1}^n a_i \sin(\omega_i x + \varphi_i) + b\right) = \sum_{k\in\mathbb{Z}^n} \alpha_k(a) \sin\big(k\cdot(\omega x + \varphi) + b\big)$$

with $\alpha_k(a)$ analytically determined by Bessel functions of the weights (Novello, 2022). This expansion reveals that every SIREN layer increases the richness of the spectral dictionary: the first layer sets a basis of “input frequencies,” and higher layers combine these into higher-order composite frequencies.

A theoretical upper bound for the amplitude of each harmonic component is

$$|\alpha_k(a)| < \prod_{i=1}^n \frac{\left(\frac{|a_i|}{2}\right)^{|k_i|}}{|k_i|!},$$

which implies a spectral bias toward low frequencies unless the weights are large, because the amplitude of a harmonic decays factorially in its order $|k_i|$ (Novello, 2022). Since sine is an analytic function, SIRENs are smooth and infinitely differentiable ($C^\infty$), and their spectrum can cover arbitrarily high frequencies, as governed by network width, initialization, and task-specific frequency scaling.
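
The harmonic expansion and the amplitude bound can be checked numerically. The sketch below assumes $\alpha_k(a) = \prod_i J_{k_i}(a_i)$, the product of integer-order Bessel functions that follows from the Jacobi-Anger identity; the source only states that the coefficients are determined by Bessel functions. The helper names `bessel_j` and `harmonic_sum` and all numeric values are illustrative.

```python
import numpy as np
from itertools import product
from math import factorial

def bessel_j(k, a, m=512):
    """Integer-order Bessel function J_k(a), computed from the periodic integral
    J_k(a) = (1 / 2pi) * integral_0^{2pi} cos(k t - a sin t) dt on a uniform grid."""
    t = 2.0 * np.pi * np.arange(m) / m
    return float(np.mean(np.cos(k * t - a * np.sin(t))))

def harmonic_sum(a, theta, b, k_max=12):
    """Truncated expansion: sum over k in [-k_max, k_max]^n of alpha_k * sin(k . theta + b)."""
    total = 0.0
    for k in product(range(-k_max, k_max + 1), repeat=len(a)):
        alpha = np.prod([bessel_j(ki, ai) for ki, ai in zip(k, a)])
        total += alpha * np.sin(np.dot(k, theta) + b)
    return total

# One sine neuron fed by n = 2 first-layer sinusoids (values chosen arbitrarily).
a = np.array([1.3, 0.7])        # second-layer weights a_i
omega = np.array([3.0, 5.0])    # first-layer frequencies omega_i
phi = np.array([0.2, -0.4])     # first-layer phases varphi_i
b, x = 0.1, 0.37
theta = omega * x + phi

direct = np.sin(np.dot(a, np.sin(theta)) + b)   # left-hand side of the identity
expanded = harmonic_sum(a, theta, b)            # truncated right-hand side

# Per-factor amplitude bound |J_k(a)| <= (|a| / 2)^k / k! for the first weight.
bound_ok = all(
    abs(bessel_j(k, a[0])) <= (abs(a[0]) / 2) ** k / factorial(k)
    for k in range(8)
)
```

Because the bound decays factorially in $|k_i|$, truncating at `k_max=12` is already far below floating-point relevance for weights of order one; larger weights would need a larger truncation.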

3. Spectral Bias, Bottlenecks, and Frequency Control

Despite their capacity, SIRENs display a pronounced spectral bias: during training, low-frequency modes are fit before high frequencies. This bias arises both from initialization and the optimization dynamics of gradient descent (Gao et al., 2024; Novello et al., 2024; Chandravamsi et al., 16 Sep 2025).

A critical phenomenon is the “spectral bottleneck,” where, if the initial network spectrum does not cover frequencies present in the target, the model can collapse to near-zero output, failing to fit even in-band content (Chandravamsi et al., 16 Sep 2025). To mitigate this, initialization schemes such as WINNER—weight initialization with adaptive noise governed by the target’s spectral centroid—have been proposed. This approach perturbs the first two layers’ weights by Gaussian noise with scale set by a function of the signal’s frequency centroid, leading to significant improvements in PSNR for audio, images, and SDF tasks without adding trainable parameters (Chandravamsi et al., 16 Sep 2025).
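
The idea of tying an initialization perturbation to the target's spectral centroid can be illustrated as follows. The centroid computation is standard, but the mapping from centroid to noise scale in `perturb_weights` (a linear ratio to the Nyquist frequency with a `gain` factor) is a placeholder assumption; the text does not give the actual function used by WINNER, and both function names are hypothetical.

```python
import numpy as np

def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency of a real 1-D signal, in Hz."""
    mag = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(np.sum(freqs * mag) / np.sum(mag))

def perturb_weights(W, centroid, nyquist, gain=1.0, seed=0):
    """Add zero-mean Gaussian noise to W.

    ASSUMPTION: the noise scale gain * centroid / nyquist is a stand-in,
    not the published WINNER rule."""
    rng = np.random.default_rng(seed)
    scale = gain * centroid / nyquist
    return W + rng.normal(0.0, scale, size=W.shape)

# Target: a 50 Hz sine sampled at 1024 Hz for one second.
fs = 1024
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 50 * t)
centroid = spectral_centroid(target, fs)

# Perturb a stand-in weight matrix; no trainable parameters are added.
W1 = np.zeros((8, 8))
W1_noisy = perturb_weights(W1, centroid, nyquist=fs / 2)
```

A target with more high-frequency content has a higher centroid and therefore receives a larger perturbation under this placeholder rule.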

Bandlimit and frequency coverage can also be improved via architectural augmentations: H-SIREN extends the standard SIREN by replacing the first layer's sine with $\sin(\omega_0 \sinh(2x))$, thus injecting an infinite spectrum of harmonics and enabling strong high-frequency representation while retaining SIREN's beneficial low-mode bias (Gao et al., 2024). Empirically, H-SIREN achieves PSNR and SSIM gains exceeding $+10$ dB and $+0.15$, respectively, over vanilla SIREN in 2D image fitting, and consistent improvements in video, NeRF, and graph-based physics applications (Gao et al., 2024).
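
A small sketch makes the difference between the two first-layer activations concrete. Here `z` stands for the first-layer pre-activation, the default `r=2.0` follows the $\sinh(2x)$ form quoted above, and the function names are illustrative. The local frequency of the H-SIREN activation is the derivative of its phase, $\omega_0 r \cosh(rz)$, which grows with $|z|$.

```python
import numpy as np

def siren_first_layer(z, omega_0=30.0):
    """Standard SIREN first-layer activation sin(omega_0 * z)."""
    return np.sin(omega_0 * z)

def h_siren_first_layer(z, omega_0=30.0, r=2.0):
    """H-SIREN first-layer activation sin(omega_0 * sinh(r * z))."""
    return np.sin(omega_0 * np.sinh(r * z))

def h_siren_local_frequency(z, omega_0=30.0, r=2.0):
    """Derivative of the phase omega_0 * sinh(r z), i.e. omega_0 * r * cosh(r z)."""
    return omega_0 * r * np.cosh(r * z)

z = np.linspace(-1.0, 1.0, 5)
plain = siren_first_layer(z)                 # constant local frequency omega_0
hyper = h_siren_first_layer(z)               # local frequency grows with |z|
local_freq = h_siren_local_frequency(z)      # smallest at z = 0, largest at |z| = 1
```

Near $z = 0$ the hyperbolic sine is approximately linear, so the activation behaves like an ordinary sine there, while larger $|z|$ is mapped to progressively higher frequencies.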

Bandlimited SIRENs can also be constructed by “freezing” frequency bases in the first layer (SASNet), spatially masking high-frequency contribution by location to suppress overfitting in smooth regions, and enabling high-fidelity INR fitting with robust hyperparameter performance (Feng et al., 12 Mar 2025).

4. Training, Optimization, and Stability

SIRENs are trained to minimize a sample- or domain-appropriate loss, such as mean squared error for image or audio fitting, Poisson log-likelihood for PET reconstruction, or physics-informed residuals for PDEs. Regularization is rarely needed, as the inductive bias imposed by the architecture controls smoothness and frequency content (Sitzmann et al., 2020; Miotto et al., 29 Jan 2025; Moussaoui et al., 26 Mar 2025). Standard optimizers like Adam or L-BFGS are effective (Moussaoui et al., 26 Mar 2025; Miotto et al., 29 Jan 2025).

Training stability depends sensitively on bandwidth hyperparameters, layer width, and initialization. Narrow SIRENs are highly sensitive to random seed, showing PSNR variance scaling as $1/\sqrt{w}$ with layer width $w$, due to poor sampling of the frequency basis in the first layer (Vonderfecht et al., 2024). Empirically, a substantial fraction of final error variance is attributable to first-layer randomness, and meta-learning or freezing this layer can halve encoding variability.

To ensure depth stability, recent initialization schemes fix the variance of the layerwise Jacobian (the “edge-of-chaos” regime), guaranteeing that gradients neither vanish nor explode with depth and that spurious high modes are not over-amplified in very deep architectures (Combette et al., 6 Dec 2025). For deep SIRENs, this approach yields linear scaling of NTK trace and bounded conditioning, resulting in fast yet controlled training dynamics.

5. Applications in Vision, Physics, and Signal Processing

SIRENs are applied wherever continuous, differentiable signal representations are required:

  • Vision and graphics: Fitting 2D images, video, 3D shapes, and radiance fields, SIRENs outperform ReLU/tanh-MLPs and Fourier-feature networks at capturing high-frequency variations, edges, and per-pixel detail (Sitzmann et al., 2020; Conde et al., 2024). In neural radiance fields (NeRFs), direct SIREN parameterizations obviate external positional encoding at moderate compression factors (Gao et al., 2024).
  • Image compression: SIRENs can reconstruct 512×512 images with PSNR ~31 dB and SSIM ~0.85 at 2.4× compression. At low bitrates, SIRENs rival JPEG2000, but achieving robustness against noise, packet loss, and parameter pruning is challenging without redundancy (Conde et al., 2024; Vonderfecht et al., 2024).
  • Medical imaging and inverse problems: SIRENs parameterize PET activity maps, solving continuous inverse problems with positivity constraints, outperforming both classical penalized-likelihood and deep image prior reconstructions in contrast, bias, and edge preservation (Moussaoui et al., 26 Mar 2025). For pressure reconstruction from velocimetry, SIRENs enable mesh-free, noise-tunable, differentiable solutions, outperforming both matrix-integration and Green’s-function approaches, especially on unstructured domains (Miotto et al., 29 Jan 2025).
  • Physics-informed neural networks (PINNs): The analytic derivatives available in SIRENs allow direct enforcement of differential constraints for PDEs (e.g., Navier–Stokes, Poisson, Helmholtz, Eikonal), yielding superior accuracy compared to standard MLPs and enabling regularity diagnostics via residual error concentration and Gibbs localization (Sitzmann et al., 2020; Burton, 18 Mar 2026).
  • EEG–fMRI translation and control systems: SIRENs are integrated as feature extractors for multi-channel EEG, outperforming state-of-the-art neural architectures in reconstructing fMRI signals (Li et al., 2023). For guidance and control, replacing ReLU/Softplus by sine activations in G&C networks accelerates training and provides more accurate trajectory tracking and optimal policy learning for drone and spacecraft tasks (Origer et al., 2024).
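
The analytic derivatives that the PINN applications rely on can be illustrated by propagating the Jacobian through the layers with the chain rule, using $\frac{d}{dz}\sin(z) = \cos(z)$. The sketch below is a minimal NumPy example with a hypothetical function name and arbitrary random weights; it compares the closed-form input gradient with a central finite difference.

```python
import numpy as np

def siren_value_and_jacobian(params, x, omega_0=30.0):
    """Return f(x) and df/dx for a single input point x (1-D array)."""
    W0, b0 = params[0]
    z = omega_0 * (W0 @ x) + b0
    h = np.sin(z)
    J = (omega_0 * np.cos(z))[:, None] * W0      # d h_1 / d x
    for W, b in params[1:-1]:
        z = W @ h + b
        J = np.cos(z)[:, None] * (W @ J)         # chain rule through sin(W h + b)
        h = np.sin(z)
    WL, bL = params[-1]
    return WL @ h + bL, WL @ J                   # linear output layer

# Small network with arbitrary weights: 2 inputs, two sine layers of width 8, 1 output.
rng = np.random.default_rng(0)
width, omega_0 = 8, 30.0
params = [
    (rng.uniform(-0.5, 0.5, (width, 2)), np.zeros(width)),
    (rng.uniform(-1, 1, (width, width)) * np.sqrt(6 / width) / omega_0, np.zeros(width)),
    (rng.uniform(-1, 1, (1, width)) * np.sqrt(6 / width) / omega_0, np.zeros(1)),
]

x = np.array([0.3, -0.2])
value, jac = siren_value_and_jacobian(params, x, omega_0)

# Central finite differences as an independent check of the closed-form Jacobian.
eps = 1e-6
columns = []
for i in range(x.size):
    step = np.zeros_like(x)
    step[i] = eps
    f_plus, _ = siren_value_and_jacobian(params, x + step, omega_0)
    f_minus, _ = siren_value_and_jacobian(params, x - step, omega_0)
    columns.append((f_plus - f_minus) / (2 * eps))
jac_fd = np.stack(columns, axis=1)
```

Higher-order derivatives follow the same pattern, since differentiating a sine layer yields a cosine layer with the same weights; in practice an automatic differentiation framework would usually handle this.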

6. Limitations, Extensions, and Future Directions

SIRENs exhibit several documented limitations:

  • Spectral bias: Intrinsic preference for low frequencies may hinder representation of sharp discontinuities or very high-frequency structure unless initialization is appropriately broadened (Gao et al., 2024; Chandravamsi et al., 16 Sep 2025).
  • Sensitivity to initialization and hyperparameters: Improper frequency scaling or seed dependence in narrow networks can lead to high error variance, slow convergence, and performance unpredictability (Vonderfecht et al., 2024).
  • Overfitting and robustness: Out-of-band noise or architecture-induced high-frequency artifacts can arise in overparameterized settings, especially in the absence of frequency or mask controls (Feng et al., 12 Mar 2025; Conde et al., 2024).

Augmentations to address these issues include H-SIREN’s hyperbolic activation (Gao et al., 2024), SASNet’s fixed frequency bases and spatial masks (Feng et al., 12 Mar 2025), TUNER’s spectral sampling and bounding (Novello et al., 2024), and WINNER’s adaptive-noise initializations (Chandravamsi et al., 16 Sep 2025).

Advances in “edge-of-chaos” initialization provide a principled approach to controlling gradient propagation and frequency leakage in deep networks (Combette et al., 6 Dec 2025). Managing the tradeoff between representable bandwidth and network depth, width, and training stability remains an active research area.

A plausible implication is that further theoretical analysis connecting SIREN’s implicit kernels, spectral properties, and physical boundary conditions may yield even broader utility in simulation, signal recovery, and structured data compression.

7. Summary Table: Core Properties and Mechanisms

| Aspect | SIREN (Original) | Recent Extensions |
| --- | --- | --- |
| Activation | $\sin(\omega_0 z)$ | H-SIREN: $\sin(\omega_0 \sinh(rz))$ (Gao et al., 2024) |
| Initialization | Layer-wise uniform scaling for pre-activation variance | WINNER noise (Chandravamsi et al., 16 Sep 2025), Edge-of-chaos (Combette et al., 6 Dec 2025) |
| Frequency support | Fixed by $\omega_0$ at input, spectral bias to low frequencies | Broadened by H-SIREN, WINNER, SASNet (Gao et al., 2024, Feng et al., 12 Mar 2025, Chandravamsi et al., 16 Sep 2025) |
| Analytic derivatives | All orders, closed form | Retained in extensions |
| Spectrum control | Manual via $\omega_0$, width, depth | Spectral sampling, bounding, masking (Novello et al., 2024, Feng et al., 12 Mar 2025) |
| Representative use cases | Images, SDFs, NeRF, PDE PINNs, compression | PET imaging, pressure reconstruction, control, EEG-fMRI (Moussaoui et al., 26 Mar 2025, Miotto et al., 29 Jan 2025, Origer et al., 2024, Li et al., 2023) |

The properties listed above are drawn from the cited works. SIRENs are thus a foundational family of coordinate-based neural networks, with a rapidly expanding body of theory, initialization practice, and cross-domain applications.
