CSI-INR: Implicit Neural Representations

Updated 3 March 2026

CSI-INR is a continuous, coordinate-based neural network framework that models signals as functions mapping coordinates to values.
The approach leverages specialized activations (e.g., SIREN) and Fourier-based positional encodings to overcome spectral bias and capture high-frequency details.
CSI-INR enables practical advances in image super-resolution, 3D modeling, and wireless communications through efficient compression, meta-learning, and robust reconstruction.

Implicit neural representations (INRs) are coordinate-based neural networks—typically multilayer perceptrons (MLPs)—that model signals such as images, audio, or 3D shapes as continuous functions, mapping input coordinates directly to signal values. The "CSI-INR" paradigm refers, both historically and by notational analogy, to coordinate-based or continuous signal implicit representations, which stand in contrast to conventional discrete, grid-based approaches. INRs have emerged as a foundational tool for high-fidelity signal representation, efficient compression, inverse problem solving, and meta-learning, especially across computer vision, graphics, wireless communications, and computational imaging.

1. Mathematical Foundations and Basic Formulation

An implicit neural representation of a signal is defined by a neural network function $f_\theta$, parameterized by weights $\theta$, such that

$f_\theta : \mathbb{R}^d \to \mathbb{R}^c.$

For instance, a color image is modeled as $f_\theta: \mathbb{R}^2 \to \mathbb{R}^3$, mapping a spatial coordinate $x = (u, v)$ to an RGB value. The standard layerwise transformation is

$z^{(0)} = x,\quad z^{(i)} = \sigma(W_i z^{(i-1)} + b_i), \quad (i=1, \ldots, L-1),\quad f_\theta(x) = W_L z^{(L-1)} + b_L,$

where $\sigma$ is a nonlinear activation function. Because $f_\theta$ is defined on the whole continuous domain, an INR can be sampled at arbitrary resolution or queried off-grid without explicit discretization (Essakine et al., 2024, Xu et al., 2023).
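The layerwise formula above can be sketched in a few lines of NumPy. This is a minimal illustration rather than any cited paper's implementation: the helper names (`init_mlp`, `inr_forward`, `pixel_grid`), the layer widths, the random initialization, and the `tanh` placeholder activation are all assumptions made for the example. The weights are untrained, so the point is only that one set of parameters can be queried at any resolution or at off-grid coordinates.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Return a list of (W, b) pairs for layer widths such as [2, 64, 64, 3]."""
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))
        b = np.zeros(fan_out)
        params.append((W, b))
    return params

def inr_forward(params, x, sigma=np.tanh):
    """Evaluate f_theta at coordinates x of shape (N, d); returns (N, c)."""
    z = x
    for W, b in params[:-1]:
        z = sigma(z @ W.T + b)      # z^(i) = sigma(W_i z^(i-1) + b_i)
    W_L, b_L = params[-1]
    return z @ W_L.T + b_L          # the last layer is linear

def pixel_grid(h, w):
    """Coordinates of an h-by-w grid on [-1, 1]^2, shape (h*w, 2)."""
    u = np.linspace(-1.0, 1.0, w)
    v = np.linspace(-1.0, 1.0, h)
    uu, vv = np.meshgrid(u, v)
    return np.stack([uu.ravel(), vv.ravel()], axis=-1)

rng = np.random.default_rng(0)
params = init_mlp([2, 64, 64, 3], rng)                  # f_theta: R^2 -> R^3
coarse = inr_forward(params, pixel_grid(8, 8))          # 64 samples
fine = inr_forward(params, pixel_grid(32, 32))          # 1024 samples, same weights
off_grid = inr_forward(params, np.array([[0.123, -0.456]]))  # arbitrary coordinate
```

Both grids share the corner coordinate $(-1, -1)$, so the first row of `coarse` and the first row of `fine` coincide: changing the sampling density changes only where the function is evaluated, not the function itself.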

The canonical INR forms the basis of NeRF-style 3D scene modeling, coordinate-based superresolution, denoising, inpainting, CT reconstruction, and MIMO channel feedback (Xu et al., 2023, Essakine et al., 2024, Wu et al., 2024).

2. Core Activation Functions and Spectral Properties

A critical aspect of INR design is the choice of nonlinear activation function, as it determines the spectral bias and capacity to represent fine-scale details. Conventional (ReLU-based) MLPs suffer from a "spectral bias" toward low-frequency fitting, limiting their expressiveness for high-frequency content (Essakine et al., 2024, Benbarka et al., 2021). Several classes of activations have emerged:

Sinusoidal (SIREN): $\sigma(x) = \sin(\omega_0 x)$, controlling frequency content explicitly via $\omega_0$. SIREN MLPs are essential for high-frequency implicit modeling (Xu et al., 2023, Essakine et al., 2024).
Fourier-Gabor-Wavelet: $\sigma(x) = \exp[-(s x)^2] \sin(\omega x)$ (Gabor), offering frequency-spatial localization and the ability to represent local and global features (Essakine et al., 2024, Roddenberry et al., 2023).
Complex wavelets: First-layer wavelet templates followed by analytic nonlinearities enable INR architectures to simultaneously capture singularities (edges) and band-pass structure, with improved high-frequency fidelity and convergence speed versus sinusoidal-only or piecewise-linear MLPs (Roddenberry et al., 2023).
Learnable dynamic or harmonizer-based activations (INCODE, FINER, HOSC): These combine adaptive amplitude, frequency, phase, or kernel-width modulations for layer- or neuron-specific spectral tuning (Essakine et al., 2024).
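As a small illustration of the two closed-form activations listed above, the sketch below evaluates the sinusoidal and Gabor nonlinearities. The parameter values are assumptions chosen for the example: $\omega_0 = 30$ is a commonly used SIREN default, while the Gabor `omega` and `s` values are arbitrary. The comparison shows the qualitative difference: the sine keeps full amplitude everywhere, while the Gabor envelope confines the response to a neighborhood of zero.

```python
import numpy as np

def siren_activation(x, omega_0=30.0):
    """Sinusoidal activation sin(omega_0 * x); omega_0 sets the frequency."""
    return np.sin(omega_0 * x)

def gabor_activation(x, omega=10.0, s=5.0):
    """Gabor activation exp(-(s x)^2) * sin(omega x): a sine under a Gaussian envelope."""
    return np.exp(-(s * x) ** 2) * np.sin(omega * x)

x = np.linspace(-1.0, 1.0, 2001)
sine_response = siren_activation(x)
gabor_response = gabor_activation(x)

# Largest magnitude of each response far from the origin (|x| >= 0.9).
far = np.abs(x) >= 0.9
sine_far = np.max(np.abs(sine_response[far]))     # stays close to 1
gabor_far = np.max(np.abs(gabor_response[far]))   # suppressed by the envelope
```

Increasing `s` narrows the Gabor envelope (more spatial localization), while `omega` and `omega_0` shift the frequency band that a layer responds to.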

Ablation results underline the importance of sinusoidal activations: for image super-resolution, SIREN attains the highest PSNR among the compared configurations, which include ReLU with positional encoding and sigmoid (Xu et al., 2023).

3. Positional Encodings, Fourier Mappings, and Progressive Frequency Schemes

To further break the low-frequency bias and enable sharp reconstructions, positional encoding schemes are widely employed:

Fixed Fourier features: $\gamma(x) = [\cos(2\pi B x), \sin(2\pi B x)]$, with the frequency matrix $B$ either fixed to integer-lattice frequencies or sampled from a Gaussian. These mappings allow even shallow MLPs to access higher spectral content (Benbarka et al., 2021, Essakine et al., 2024).
Integer lattice mapping: Directly constructs the full $d$-dimensional truncated Fourier series, yielding a basis with controlled bandwidth (Nyquist scaling) and robust mathematical properties, including exact signal recovery from suitable measurements (Benbarka et al., 2021).
Progressive training: Gradually unmasking higher frequency components in Fourier-mapped feature sets during training avoids overfitting and enforces statistical smoothness (Benbarka et al., 2021).
Input scaling (Kernel transformation): Scaling the input coordinates to a range matched to the backbone (SIREN has its own preferred range) aligns the signal with the frequency range the MLP backend fits best; ablation studies report major PSNR gains from this simple input normalization (Zheng et al., 7 Apr 2025).
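The encodings in this list can be sketched as follows. This is an illustrative NumPy version under stated assumptions: the function names are hypothetical, the Gaussian standard deviation of 10 is an arbitrary choice, and the hard norm-based mask with a linear schedule is just one simple way to realize "gradual unmasking", not necessarily the schedule used by Benbarka et al. The full integer lattice is kept for brevity even though the $\pm k$ pairs are redundant.

```python
import numpy as np

def fourier_features(x, B):
    """Map x of shape (N, d) with frequencies B of shape (m, d) to (N, 2m) features."""
    proj = 2.0 * np.pi * x @ B.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

def integer_lattice(d, max_freq):
    """All integer frequency vectors in {-max_freq, ..., max_freq}^d, shape (m, d)."""
    axes = [np.arange(-max_freq, max_freq + 1)] * d
    grid = np.meshgrid(*axes, indexing="ij")
    return np.stack([g.ravel() for g in grid], axis=-1).astype(float)

def progressive_mask(B, step, total_steps):
    """0/1 mask admitting frequencies whose norm lies below a linearly growing cutoff."""
    norms = np.linalg.norm(B, axis=1)
    cutoff = norms.max() * min(1.0, (step + 1) / total_steps)
    keep = (norms <= cutoff).astype(float)
    return np.concatenate([keep, keep])   # same mask for the cos and sin halves

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(5, 2))
B_lattice = integer_lattice(d=2, max_freq=3)        # 7 * 7 = 49 frequency vectors
B_gauss = rng.normal(0.0, 10.0, size=(49, 2))       # Gaussian alternative
feats = fourier_features(x, B_lattice)              # shape (5, 98)
early = feats * progressive_mask(B_lattice, step=0, total_steps=10)
late = feats * progressive_mask(B_lattice, step=9, total_steps=10)
```

At the first step only the zero frequency passes the mask, so the downstream network initially sees a constant feature; by the final step the mask is all ones and `late` equals `feats`.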

Table: Fourier/Encoding Approaches and Properties

Encoding	Bandwidth Control	Spatial Localization	Recommended Architecture
Basic Fourier	explicit	none	Shallow/MLP+sinusoidal
Integer lattice PE	explicit	none	Linear on PE (Fourier series)
Gabor/Wavelet	yes	yes	Split or composite MLP
Input Scaling	aligned	none	Any backend (SIREN, FINER)

4. Kernel Transformations and Network Structure Optimizations

Recent work systematically studies the effect of kernel-level linear transformations—specifically input scaling and output shifting (the "Scale-and-Shift" or "SS-INR" module). These trivial modifications:

Add negligible computational overhead (only two element-wise operations per forward pass).
Leave the parameter count essentially unchanged (only a few extra scalars, the number depending on the output channel count).
Can yield a larger PSNR gain than adding two full MLP layers, while keeping the model smaller (Zheng et al., 7 Apr 2025).

Empirically, scaling the input coordinates by a backbone-specific factor and centering the outputs with a constant shift raises both PSNR and SSIM across tasks and backbones (Zheng et al., 7 Apr 2025). The effect can be interpreted jointly as an implicit increase in depth (extra linear layers) and as improved conditioning via normalization.
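A scale-and-shift wrapper of this kind takes only a few lines. The sketch below is an assumption-laden illustration, not the SS-INR reference code: `with_scale_and_shift`, the toy backbone, the scalar `in_scale`, and the per-channel `out_shift` values are all hypothetical, and the source does not specify whether the scale is scalar or per-dimension.

```python
import numpy as np

def with_scale_and_shift(backbone, in_scale, out_shift):
    """Wrap backbone(x) so it sees scaled inputs and emits shifted outputs."""
    out_shift = np.asarray(out_shift, dtype=float)

    def wrapped(x):
        # Two element-wise operations around an unchanged backbone.
        return backbone(in_scale * x) + out_shift

    return wrapped

def toy_backbone(x):
    """Toy stand-in for a trained INR backbone mapping R^2 -> R^3."""
    return np.stack(
        [np.sin(x[:, 0]), np.cos(x[:, 1]), np.sin(x[:, 0] + x[:, 1])], axis=-1
    )

coords = np.array([[0.0, 0.0], [0.5, -0.5]])
channel_mean = np.array([0.4, 0.5, 0.6])   # hypothetical per-channel signal mean
model = with_scale_and_shift(toy_backbone, in_scale=2.0, out_shift=channel_mean)
pred = model(coords)                        # shape (2, 3)
```

With this arrangement the backbone would be fit to centered targets and the shift added back at query time; the wrapper adds one scalar plus one value per output channel in this particular variant.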

Additionally, combined strategies—such as multiplicative filter networks, conditional harmonizers, and Fourier reparameterization of weights—have been shown to further advance resolution, adaptability, and reconstruction quality (Essakine et al., 2024).

5. Learning, Meta-Learning, and Data-Driven Compression

Optimization-based INR fitting: Training network parameters from scratch to fit a given signal using observation-specific losses. Early stopping is vital in denoising tasks to prevent overfitting noise (Xu et al., 2023).
Meta-learning paradigms: Instead of per-instance retraining, meta-learned methods (e.g. MAML) learn a base initialization, then adapt via a few gradient steps through adaptation vectors or modulation codewords (Wu et al., 2024).
Transformer hypernetworks: Recasting the entire INR generative process as a set-to-set mapping, Transformers can predict entire INR weight sets from input (patch-wise) observations, alleviating the single-vector bottleneck common in hypernetwork approaches (Chen et al., 2022). Such approaches outperform gradient-based meta-initialization in PSNR, especially for sparse or high-dimensional tasks, as each weight token can attend to all observation tokens.
Compression strategies: Sparsity-driven compressed implicit neural representations (SINR) encode the trained INR model weights using random high-dimensional overcomplete dictionaries, enabling entropy-coded, dictionary-free storage. SINR reduces storage relative to quantization and entropy-coding baselines at the cost of a PSNR drop, and it generalizes across modalities (images, NeRF, occupancy); the specific figures are reported in Jayasundara et al. (25 Mar 2025).
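The dictionary-free idea in the compression item can be illustrated with a toy sparse coder. The sketch below is not the SINR algorithm: it swaps in plain orthogonal matching pursuit (OMP) as the sparse solver, uses a hypothetical block size of 16 and 64 atoms, and omits quantization and entropy coding entirely. What it does show is that a seeded random dictionary can be regenerated by the decoder, so only the seed, the atom indices, and the coefficients need to be stored.

```python
import numpy as np

def random_dictionary(seed, dim, n_atoms):
    """Unit-norm Gaussian dictionary, reproducible from the seed alone."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((dim, n_atoms))
    return D / np.linalg.norm(D, axis=0, keepdims=True)

def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit: y ~= D[:, support] @ coef."""
    residual = y.copy()
    support, coef = [], np.zeros(0)
    for _ in range(n_nonzero):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:            # nothing new to add; residual is exhausted
            break
        support.append(k)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    return np.array(support, dtype=int), coef

def encode(weights, seed, n_atoms, n_nonzero, block=16):
    """Sparse-code a flat weight vector block by block."""
    D = random_dictionary(seed, block, n_atoms)
    return [omp(D, b, n_nonzero) for b in weights.reshape(-1, block)]

def decode(codes, seed, n_atoms, block=16):
    """Rebuild the weights from (support, coef) pairs and the shared seed."""
    D = random_dictionary(seed, block, n_atoms)
    return np.concatenate([D[:, s] @ c for s, c in codes])

weights = np.random.default_rng(1).standard_normal(256)   # stand-in for INR weights
codes = encode(weights, seed=42, n_atoms=64, n_nonzero=8)
recon = decode(codes, seed=42, n_atoms=64)
rel_err = np.linalg.norm(recon - weights) / np.linalg.norm(weights)
```

Raising `n_nonzero` trades storage for fidelity; random Gaussian weights are a hard case, so `rel_err` here says little about how compressible trained INR weights are.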

6. Theory: Sample Complexity, Representation, and Signal Recovery

A rigorous sampling theory for CSI-INRs has emerged, especially in linear inverse problem settings:

For ReLU+Fourier feature INRs, the training problem under weight decay regularization admits a convex measure-theoretic reformulation, where every finite-width network corresponds to a sum of Dirac measures on the weight-sphere (Najaf et al., 2024).
It is proven that a width-1 INR target (a rectified trigonometric polynomial) can be exactly recovered from a small number of low-pass Fourier samples, provided the measurement bandwidth is sufficiently large relative to the highest frequency in the design. The requirement scales linearly with INR width: a target composed of more "atoms" needs a proportionally larger measurement bandwidth (Najaf et al., 2024).
Empirical results confirm a phase transition for exact recovery along the predicted measurement-count threshold.
In practical superresolution tasks, such regularized INRs drastically reduce Gibbs ringing and overfitting compared to naive IFFT, achieving state-of-the-art reconstructions at optimal sample complexity (Najaf et al., 2024).
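To make the measurement setup concrete, the sketch below builds a width-1 target of the kind described above (a ReLU applied to a trigonometric polynomial) and reconstructs it naively by zero-filling all Fourier coefficients beyond a bandwidth $K$. The coefficients of the polynomial and the grid size are arbitrary assumptions, and the code covers only the naive inverse-FFT baseline, not the regularized INR recovery of Najaf et al.

```python
import numpy as np

n = 1024
x = np.arange(n) / n
# Width-1 target: ReLU of a trigonometric polynomial (hypothetical coefficients).
trig = 0.2 + np.cos(2 * np.pi * x) + 0.5 * np.sin(2 * np.pi * 3 * x)
target = np.maximum(trig, 0.0)

def lowpass_ifft(signal, bandwidth):
    """Keep Fourier coefficients with |k| <= bandwidth, zero the rest, invert."""
    coeffs = np.fft.fft(signal)
    k = np.fft.fftfreq(signal.size, d=1.0 / signal.size)   # integer frequencies
    coeffs[np.abs(k) > bandwidth] = 0.0
    return np.fft.ifft(coeffs).real

errors = {}
for K in (4, 16, 64):
    errors[K] = np.max(np.abs(lowpass_ifft(target, K) - target))
    print(f"K={K:3d}  max abs error = {errors[K]:.4f}")
```

The rectification introduces kinks, so the target is not band-limited even though the underlying polynomial is; the naive reconstruction therefore carries a residual error at every finite $K$, which is the gap that the regularized INR formulation is meant to close.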

7. Applications and Practical Recommendations

Low-Level Vision: CSI-INRs achieve state-of-the-art results in zero-shot denoising, super-resolution, inpainting, and deblurring, outperforming DIP, Self2Self, and ZSSR in PSNR with lower resource consumption (Xu et al., 2023).
3D Shape Analysis: INR-based frameworks (e.g. inr2vec) yield compact latent codes for 3D shape retrieval, classification, and segmentation, offering unified, resolution-independent representations (Luigi et al., 2023).
Wireless Communications: In massive MIMO systems, implicit channel representations with meta-learned modulations achieve extreme CSI compression while maintaining (or exceeding) previous SOTA NMSE/distortion (Wu et al., 2024).
High-Order Inverse Problems: Integration of physics-based constraints (e.g., PDEs, known forward operators) is straightforward due to full differentiability and functional smoothness, critical for computational imaging and scientific ML (Essakine et al., 2024).
Design guidelines:
- For high-frequency detail, employ SIREN or progressive complex wavelet templates with appropriate input scaling and output centering.
- Use progressive frequency unmasking for stable, smooth training in high-resolution settings (Benbarka et al., 2021, Roddenberry et al., 2023).
- For compression, consider sparse codes over random dictionaries (no dictionary transmission needed), quantized and entropy-coded for further reduction (Jayasundara et al., 25 Mar 2025).

8. Open Directions and Limitations

Key challenges for CSI-INR research include:

Scalability and memory: Handling ultra-high resolution (e.g., megapixel or 3D volume) signals while maintaining feasible memory and time costs, especially for deep transformer hypernetworks (Chen et al., 2022, Essakine et al., 2024).
Expressive activation and encoding design: There is a need for new nonlinearities and positional encodings with adaptive, data-dependent spectral profiles, and for frameworks combining the locality of wavelets with the universality of trigonometric and polynomial bases (Essakine et al., 2024, Roddenberry et al., 2023).
Efficient meta-learning and generalized adaptation: Quadratic token scaling in transformer-based meta-learners, and the inability to scale to very deep or high-dimensional neural functions, remain challenging (Chen et al., 2022).
Theoretical generalization: While sample complexity is increasingly well-understood for shallow, linear and ReLU INRs, nontrivial open questions remain for deep or compositional MLPs and multimodal signals (Najaf et al., 2024).
Unified frameworks: Bridging implicit representations with explicit (voxel/mesh/grid) structures for hybrid or multi-scale learning, and integrating advanced physics constraints for scientific applications, are ongoing research frontiers (Essakine et al., 2024).