Implicit Neural Representations (INRs)

Updated 13 September 2025
  • Implicit Neural Representations are neural network models that encode signals as continuous functions mapping spatial coordinates to signal values, ensuring smooth and resolution-independent outputs.
  • They employ advanced methods such as Fourier feature positional encoding, hybrid architectures, and neural tangent kernels to capture intricate high-frequency details and promote efficient learning.
  • Applications in 3D reconstruction, image compression, scientific imaging, and signal processing showcase the practical benefits of INRs in achieving high fidelity and interpretability.

Implicit Neural Representations (INRs) are neural network-based models that encode signals—such as images, audio, shapes, or volumes—as continuous functions mapping spatial (and possibly temporal) coordinates to signal values. By parameterizing data implicitly, INRs unify the representation of diverse media within a coordinate-to-value paradigm, providing intrinsic resolution-independence, smoothness, and differentiability. The INR framework has seen rapid development, encompassing both mathematical foundations and a wide range of applications, from geometric modeling and signal compression to scientific computing and interpretability.

1. Mathematical Formulation and Structural Principles

The canonical INR is a multilayer perceptron (MLP) that realizes a function

$$f_\theta : \mathbb{R}^d \to \mathbb{R}^k,$$

where $\theta$ denotes the network parameters, $d$ is the coordinate dimension (e.g., 2D image, 3D space), and $k$ the output channel count (e.g., RGB, scalar field). The network processes the input coordinate vector (optionally transformed by a positional encoding, such as Fourier features) through a series of learned affine transformations and nonlinear activations. For instance, in SIREN architectures, each layer employs a sinusoidal activation: $x_{l+1} = \sin(\omega_l W_l x_l + b_l)$, where $\omega_l$ modulates the frequency.
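
To make the formulation concrete, the following is a minimal sketch of such a coordinate MLP with sinusoidal activations, assuming PyTorch; the width, depth, and frequency scale $\omega$ are illustrative defaults rather than values prescribed by any particular paper.

```python
# Minimal SIREN-style INR sketch; layer widths, omega, and the
# coordinate/channel dimensions are illustrative choices.
import torch
import torch.nn as nn


class SineLayer(nn.Module):
    """Affine map followed by a sinusoidal activation: sin(omega * (Wx + b))."""

    def __init__(self, in_features, out_features, omega=30.0):
        super().__init__()
        self.omega = omega
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x):
        return torch.sin(self.omega * self.linear(x))


class SirenINR(nn.Module):
    """f_theta: R^d -> R^k, e.g. 2D pixel coordinates -> RGB values."""

    def __init__(self, d=2, k=3, hidden=256, depth=4):
        super().__init__()
        layers = [SineLayer(d, hidden)]
        layers += [SineLayer(hidden, hidden) for _ in range(depth - 1)]
        self.body = nn.Sequential(*layers)
        self.head = nn.Linear(hidden, k)  # linear output layer

    def forward(self, coords):
        return self.head(self.body(coords))


# Query the representation at arbitrary, resolution-independent coordinates.
model = SirenINR()
coords = torch.rand(1024, 2) * 2 - 1   # coordinates in [-1, 1]^2
rgb = model(coords)                    # shape (1024, 3)
```

Because the representation is a continuous function, the same trained model can be sampled on any coordinate grid, which is the source of the resolution independence noted above.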

Recent advanced architectures extend this formulation through:

  • Hybrid Convolutional-MLP blocks: Combining convolutional inductive biases for localized information processing with coordinate MLPs for expressiveness (Song et al., 2022).
  • Mixture of Experts (MoE): Employing multiple expert MLPs, each specializing in a subset of the domain and selected via a manager network to yield piecewise-continuous function representations (Ben-Shabat et al., 29 Oct 2024); a minimal gating sketch follows this list.
  • Iterative Refinement: Incorporating auxiliary correction networks and fusion layers for progressive detail recovery and noise robustness (Haider et al., 24 Apr 2025).
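
The sketch below illustrates the mixture-of-experts idea, assuming PyTorch: a small manager network produces per-coordinate gating weights over several expert MLPs. A soft softmax gate keeps the example differentiable; hard (argmax) selection at inference would yield the piecewise behavior described above. The sizes and gating rule are illustrative assumptions, not the exact scheme of the cited work.

```python
# Hedged mixture-of-experts INR sketch: a manager network gates several
# expert MLPs per coordinate. All sizes are illustrative.
import torch
import torch.nn as nn


class MoEINR(nn.Module):
    def __init__(self, d=2, k=1, n_experts=4, hidden=64):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d, hidden), nn.ReLU(),
                          nn.Linear(hidden, hidden), nn.ReLU(),
                          nn.Linear(hidden, k))
            for _ in range(n_experts)
        ])
        # Manager maps a coordinate to logits over the experts.
        self.manager = nn.Sequential(nn.Linear(d, hidden), nn.ReLU(),
                                     nn.Linear(hidden, n_experts))

    def forward(self, coords):
        gates = torch.softmax(self.manager(coords), dim=-1)        # (N, E)
        outs = torch.stack([e(coords) for e in self.experts], -1)  # (N, k, E)
        return (outs * gates.unsqueeze(1)).sum(-1)                 # (N, k)


model = MoEINR()
y = model(torch.rand(8, 2))   # shape (8, 1)
```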

2. Expressiveness: Frequency Support, Dictionary Perspective, and NTK

INRs differ fundamentally from discrete representations in their spectral expressiveness. A key insight is that deep INRs are equivalent to “structured signal dictionaries,” whose atoms arise as follows:

  • Integer Harmonics: A base set of input mapping frequencies is expanded via polynomial (or analytic) activations into a dictionary whose atoms are sinusoidal components, each corresponding to an integer linear combination of the base frequencies. The depth $L$ and activation polynomial degree $K$ control the exponential frequency support as $K^{L-1}$ (Yüce et al., 2021); a numerical check follows below.
  • Neural Tangent Kernel (NTK): The eigenfunctions of the empirical NTK at initialization act as dictionary atoms. The alignment between target signal energy and NTK eigenfunctions with large eigenvalues determines sample efficiency and learnability; meta-learning shifts this alignment for accelerated adaptation (Yüce et al., 2021). Mathematically: $$f(r) = \sum_{\omega' \in \mathcal{J}(\Omega)} c_{\omega'} \sin(\langle \omega', r\rangle + \phi_{\omega'}),$$ where

$$\mathcal{J}(\Omega) \subseteq \left\{\sum_t s_t \Omega_t : s_t \in \mathbb{Z},\ \sum_t |s_t| \leq K^{L-1}\right\}.$$
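
As a small numerical check of the dictionary view, assuming NumPy: squaring a weighted sum of two encoded sinusoids (degree $K = 2$, depth $L = 2$, so $K^{L-1} = 2$) concentrates all spectral energy on integer combinations $s_1 \cdot 3 + s_2 \cdot 7$ with $|s_1| + |s_2| \leq 2$. The base frequencies, weights, and threshold are arbitrary choices for the demonstration.

```python
# Squaring (a degree-2 polynomial activation) a combination of two base
# sinusoids produces energy only at integer combinations of the base
# frequencies, as predicted by the harmonic-dictionary view.
import numpy as np

N = 1024
x = np.arange(N) / N                               # unit interval
base = np.array([3.0, 7.0])                        # base mapping frequencies (cycles)
features = np.sin(2 * np.pi * np.outer(x, base))   # (N, 2) positional encoding

# One hidden "layer" with a degree-2 polynomial activation (K = 2, L = 2).
hidden = (features @ np.array([1.0, 0.5])) ** 2

spectrum = np.abs(np.fft.rfft(hidden)) / N
peaks = np.nonzero(spectrum > 1e-6)[0]
print(peaks)   # -> [0, 4, 6, 10, 14], i.e. 0, 7-3, 2*3, 7+3, 2*7 cycles
```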

3. Optimization: Loss Function Engineering, Regularization, and Initialization

Traditional INR training relies on standard regression losses, but recent work introduces domain-theoretic priors via variational losses and specialized regularization:

  • Phase Transition–Inspired Losses: Surface reconstruction can be posed using an energy functional analogous to phase separation, constructed as

$$F_\varepsilon(u) = \lambda\, \mathcal{L}(u) + \int_\Omega \left[\varepsilon\, |\nabla u|^2 + \frac{1}{\varepsilon} W(u)\right],$$

with $W(u)$ a double-well potential regularizing $u$ towards binary occupancy, $\mathcal{L}(u)$ a local reconstruction loss enforcing that the zero level set passes near the input data, and the gradient term inducing spatial regularity. The asymptotic limit as $\varepsilon \to 0$ provably yields minimal-perimeter surfaces under strict reconstruction constraints, thus providing a strong geometric inductive bias (Lipman, 2021). A minimal autograd sketch of this objective appears after the list below.

  • Generalized Weight Decay in Sampled Recovery: For linear inverse problems and sampling theory, the regularization is formulated via a positively 1-homogeneous function $\eta(w)$, promoting sparsity in the function-space representation, connected to convex measure recovery (Najaf et al., 28 May 2024).
  • Kernel Transformation for Conditioning: Linear input and output transformations—scaling coordinates and shifting outputs—dramatically improve network trainability and expressiveness, mimicking the benefit of increased depth or normalization (Zheng et al., 7 Apr 2025).
  • Variance Informed Initialization: A principled initialization that stabilizes both forward activation and backward gradient variances for arbitrary activation functions and deep MLPs, essential for high-frequency and non-saturated activations (Koneputugodage et al., 27 Apr 2025).
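
The sketch below, assuming PyTorch, shows how the phase transition-inspired objective above can be assembled with automatic differentiation. The choice of wells at $\pm 1$ for $W(u)$, the Softplus network, and the values of $\lambda$ and $\varepsilon$ are illustrative assumptions consistent with the description, not the exact construction of the cited paper.

```python
# Hedged sketch of a phase-transition style surface reconstruction objective:
# reconstruction term at surface samples, plus a Modica-Mortola style
# regularizer (gradient energy + double-well potential) at domain samples.
import torch
import torch.nn as nn

u = nn.Sequential(nn.Linear(3, 128), nn.Softplus(beta=100),
                  nn.Linear(128, 128), nn.Softplus(beta=100),
                  nn.Linear(128, 1))


def phase_transition_loss(points_surface, points_domain, eps=1e-2, lam=10.0):
    # Reconstruction term: the zero level set should pass near the input data.
    recon = u(points_surface).abs().mean()

    # Regularizer over domain samples; gradient of u obtained via autograd.
    x = points_domain.requires_grad_(True)
    ux = u(x)
    grad = torch.autograd.grad(ux.sum(), x, create_graph=True)[0]
    double_well = (1.0 - ux ** 2) ** 2 / 4.0          # wells at u = +/-1 (assumption)
    reg = (eps * grad.pow(2).sum(-1) + double_well.squeeze(-1) / eps).mean()

    return lam * recon + reg


loss = phase_transition_loss(torch.rand(256, 3), torch.rand(1024, 3) * 2 - 1)
loss.backward()
```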

4. Applications and Empirical Performance

INRs have demonstrated superior, often state-of-the-art, performance in a range of domains:

  • Surface and 3D Shape Reconstruction: Representing surfaces as network zero level sets, achieving minimal-perimeter interfaces and reducing artifact surfaces compared to occupancy or distance-only methods (Lipman, 2021).
  • Image and Signal Compression: INR-based compression pipelines, integrating quantization-aware retraining and entropy coding, have outperformed classic codecs (e.g., JPEG, JPEG2000) under proper meta-learned initialization schemes, although computational bottlenecks and instabilities remain key limitations (Strümpler et al., 2021, Conde et al., 25 Sep 2024). Approaches such as SINR exploit compressibility of INR weights directly, leveraging sparsity and high-dimensional dictionary coding for aggressive bitrate reduction without quality loss (Jayasundara et al., 25 Mar 2025).
  • Scientific and Medical Imaging: The continuous nature of INRs is leveraged for medical imaging inverse problems (CT/MRI/PET reconstruction), segmentation, and registration. The SIREN architecture (sine activations) improves high-frequency detail preservation in PET, yielding contrast and bias advantages over conventional and deep image prior methods (Molaei et al., 2023, Moussaoui et al., 26 Mar 2025).
  • Signal Processing: Manipulation of INRs via analytically tractable differential operators (e.g., INSP-Net) allows low- and high-level signal processing directly in the embedded space, bypassing discretization bottlenecks (Xu et al., 2022); a small autograd sketch follows this list.
  • Explainability and Interpretability: INR frameworks can produce robust and smooth attribution masks for explainable AI by conditioning on both spatial and area parameters and iterating to find non-overlapping masks, enabling continuous, parametric exploration of model reliance on image regions (Byra et al., 20 Jan 2025).
  • Mechanics and PDE Solvers: By representing signed distance functions as INRs, complex geometries can be integrated into finite element simulations via methods such as the Shifted Boundary Method (SBM) without explicit meshing, achieving high fidelity and computational efficiency in simulations across engineering and the physical sciences (Karki et al., 3 Jul 2025).
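
As an illustration of operating on signals in the embedded space, the sketch below, assuming PyTorch, evaluates the gradient and Laplacian of a generic INR analytically via automatic differentiation; the network and operators are generic illustrations rather than the INSP-Net architecture.

```python
# Apply differential operators directly to an INR via autograd, with no
# discretization of the underlying signal.
import torch
import torch.nn as nn

inr = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 1))


def gradient_and_laplacian(coords):
    coords = coords.requires_grad_(True)
    y = inr(coords)
    grad = torch.autograd.grad(y.sum(), coords, create_graph=True)[0]   # (N, 2)
    lap = sum(
        torch.autograd.grad(grad[:, i].sum(), coords, create_graph=True)[0][:, i]
        for i in range(coords.shape[1])
    )                                                                   # (N,)
    return grad, lap


g, lap = gradient_and_laplacian(torch.rand(100, 2))
```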

5. Limitations and Computational Challenges

Despite notable successes, INRs confront several fundamental and practical limitations:

  • Encoding and Training Complexity: Training an INR per sample (e.g., for image-specific compression) is computationally intensive, imposing scalability and latency constraints in online or high-throughput settings (Conde et al., 25 Sep 2024).
  • Stability and Robustness: Performance variability due to hyperparameter sensitivity, initialization, and lack of robustness to weight or neuron loss are documented, with some methods suffering from high performance variance across runs (Conde et al., 25 Sep 2024). Mixture-of-Experts architectures, if not properly conditioned, can face expert collapse or poor domain coverage (Ben-Shabat et al., 29 Oct 2024).
  • Spectral and Locality Trade-offs: Some architectures, particularly those relying solely on global frequency expansion, lack spatial locality, impeding fine-detail capture in the absence of explicit localization mechanisms; methods such as Gaussian or windowed activations, or dictionary learning, partly address this (Essakine et al., 6 Nov 2024). A minimal Gaussian-activation layer is sketched after this list.
  • Generalization Across Modalities: Domain-specific inductive biases (e.g., for images vs. audio) require tailored architectures and loss design to capture high-frequency details and temporal dependencies effectively (Szatkowski et al., 2023).
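
A minimal sketch of one such localized alternative, assuming PyTorch: a Gaussian-activated layer whose responses decay away from the zero set of the affine map. The exact parameterization and the bandwidth $a$ are illustrative assumptions.

```python
# Gaussian-activated layer as a localized alternative to sinusoidal layers.
import torch
import torch.nn as nn


class GaussianLayer(nn.Module):
    """Affine map followed by exp(-(a * z)^2), giving localized responses."""

    def __init__(self, in_features, out_features, a=0.1):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.a = a

    def forward(self, x):
        z = self.linear(x)
        return torch.exp(-(self.a * z) ** 2)


layer = GaussianLayer(2, 64)
out = layer(torch.rand(16, 2))   # shape (16, 64)
```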

6. Future Research Directions and Open Problems

Several promising research trajectories have gained attention:

  • Expressive Activation and Encoding: Expanding the repertoire of activation and kernel functions for better spatial–frequency coverage, including structured wavelets, dynamically scaled activations, and richer position encodings (Roddenberry et al., 2023, Essakine et al., 6 Nov 2024).
  • Initialization and Architecture Scaling: Broader adoption of variance-informed initialization for arbitrary activations, adaptation to ultra-deep INR stacks, and scale-invariant architectural innovations (Koneputugodage et al., 27 Apr 2025).
  • Learning Locality and Hierarchies: MoE and piecewise-continuous architectures, as well as multi-scale and hierarchical representations, seek to balance global expressivity with localized specialization and efficient computation (Ben-Shabat et al., 29 Oct 2024).
  • Efficient Signal Processing and Inference: Algorithms for direct manipulation of INR-embedded signals, enabling differentiable and permutation-invariant operations, unlock new signal processing and generative pipelines (Xu et al., 2022).
  • Generalization, Adaptation, and Meta-Learning: Meta-learned initializations and kernels adapted to signal classes (faces, images, shapes) accelerate convergence and enhance transfer; further integration of task-specific priors and unsupervised adaptation remains open (Yüce et al., 2021, Strümpler et al., 2021).
  • Foundational Theory: Sampling theory, convex-analytic reformulations, and sparsity-driven regularization for INRs are yielding rigorous guarantees on recoverability, uniqueness, and network design (Najaf et al., 28 May 2024, Jayasundara et al., 25 Mar 2025).

7. Unification and Impact Across Domains

The continuous, differentiable, and compact characteristics of INRs yield substantial impact in knowledge representation, geometry-centric learning, compression, scientific computing, and interpretability. Their capacity to model infinite resolution signals and to enable seamless coupling between learning, analysis, and simulation workflows positions INRs as a convergent modeling paradigm. Ongoing research aims to harmonize mathematical insight, computational scalability, and real-world deployment—bridging foundational theory and applied innovation in AI and signal processing (Essakine et al., 6 Nov 2024).
