Implicit Compression: Function-Based Encoding

Updated 14 April 2026

Implicit Compression is a data compression method relying on implicit neural representations, where a compact neural function is overfitted to continuous data samples.
It optimizes a rate–distortion objective by learning per-sample network parameters that are quantized and entropy-coded, so that storage cost can be traded directly against reconstruction fidelity.
Applications span images, video, point clouds, and scientific data, demonstrating advantages in spatial adaptivity, frequency control, and versatile encoding.

Implicit compression is a method of data compression in which signals (such as images, video, volumes, or point clouds) are represented not by storing sampled values or transform coefficients, but by overfitting a compact, continuous neural function—an implicit neural representation (INR)—to the data. The compressed bitstream consists of the learned parameters of this network, potentially after quantization, entropy coding, or further model compression. Decoding then reconstructs the data by evaluating the network on desired coordinates. This paradigm has emerged as a highly general, function-based alternative to classical and latent-space codecs for a wide range of modalities, offering unique advantages in representation flexibility, spatial adaptivity, and direct rate–distortion control (Yang et al., 2022, Strümpler et al., 2021, Schwarz et al., 2023).

1. Mathematical Formulation of Implicit Compression

A canonical INR is a parameterized function $f_\theta:\mathbb{R}^M\to\mathbb{R}^C$ mapping continuous coordinates $x$ to signal values $y$. In implicit compression, the signal $d(x)$ (e.g., image intensities, voxel occupancy, RGB colors) is approximated as $f_\theta(x)\approx d(x)$ over the domain of interest (pixels, voxels, spatial/temporal/spectral coordinates) (Yang et al., 2022, Dupont et al., 2021, Cho et al., 2 Jun 2025, Dai et al., 2024). For multi-dimensional data (images, videos, point clouds), $x$ can encompass spatial, angular, temporal, band, or view indices.

Compression thus reduces to learning, per sample, the network parameters $\theta^*$ that minimize a distortion metric $D(\theta)=\mathbb{E}_x\|f_\theta(x)-d(x)\|^2$. Once optimized, these weights are quantized ($q(\theta)$), optionally sparsified or pruned, and entropy-coded (using learned, empirical, or parametric priors). The compressed bitstream consists mainly of the coded $\theta$, plus minor metadata (Strümpler et al., 2021, Damodaran et al., 2023, Cho et al., 2 Jun 2025, Wang et al., 2024).
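As an illustration of the quantities above, the following is a minimal sketch of evaluating $f_\theta$ on a coordinate grid and computing $D(\theta)$ as a mean squared error. The layer widths, the uniform initialization, the frequency scale `w0`, and the helper names (`init_mlp`, `inr`, `distortion`, `psnr`, `num_params`) are assumptions made for this example and do not come from any of the cited codecs. The weights are left untrained, so the sketch shows only how distortion is measured, not how it is minimized.

```python
import math
import random

def init_mlp(sizes, seed=0):
    """Random weights for an MLP with layer widths `sizes`, e.g. [1, 8, 8, 1]."""
    rng = random.Random(seed)
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        bound = 1.0 / math.sqrt(fan_in)  # simple uniform init (assumption)
        W = [[rng.uniform(-bound, bound) for _ in range(fan_in)]
             for _ in range(fan_out)]
        b = [0.0] * fan_out
        layers.append((W, b))
    return layers

def inr(theta, x, w0=30.0):
    """Evaluate f_theta at one coordinate vector x (sine hidden activations)."""
    h = list(x)
    for i, (W, b) in enumerate(theta):
        z = [sum(w * v for w, v in zip(row, h)) + bi for row, bi in zip(W, b)]
        is_last = i == len(theta) - 1
        h = z if is_last else [math.sin(w0 * v) for v in z]
    return h

def distortion(theta, coords, targets):
    """Mean squared error D(theta) over the sampled coordinates."""
    total = 0.0
    for x, d in zip(coords, targets):
        y = inr(theta, x)
        total += sum((yi - di) ** 2 for yi, di in zip(y, d))
    return total / len(coords)

def psnr(mse, peak=1.0):
    """Peak signal-to-noise ratio in dB for a signal with the given peak value."""
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def num_params(theta):
    """Number of scalar parameters that would have to be coded."""
    return sum(len(W) * len(W[0]) + len(b) for W, b in theta)

# A 1D toy signal sampled on 64 coordinates in [0, 1].
coords = [[i / 63.0] for i in range(64)]
targets = [[0.5 + 0.5 * math.sin(2 * math.pi * c[0])] for c in coords]
theta = init_mlp([1, 8, 8, 1])
mse = distortion(theta, coords, targets)
print(num_params(theta), mse, psnr(mse))
```

The parameter count returned by `num_params` is the quantity that, after quantization and entropy coding, determines the size of the bitstream.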

The fundamental rate–distortion objective is

$$\theta^* = \arg\min_\theta \; D(\theta) + \lambda\, R(\theta),$$

where $R(\theta)$ denotes the total coded bit-length and $\lambda$ trades off reconstruction distortion against bitrate (Strümpler et al., 2021, Damodaran et al., 2023, Cho et al., 2 Jun 2025).
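A small sketch of how such a trade-off selects an operating point, assuming the common Lagrangian form in which a cost equals distortion plus a multiplier (called `lam` here) times the rate in bits. The candidate list is purely hypothetical: the MSE and bit counts are illustrative numbers, not measurements from any cited method.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian cost: distortion plus lam times the coded length in bits."""
    return distortion + lam * rate_bits

def pick_operating_point(candidates, lam):
    """Return the candidate with the lowest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c["mse"], c["bits"], lam))

# Hypothetical (rate, distortion) pairs for one INR at three weight precisions.
candidates = [
    {"name": "6-bit", "mse": 4.0e-3, "bits": 6000},
    {"name": "8-bit", "mse": 1.0e-3, "bits": 8000},
    {"name": "12-bit", "mse": 9.0e-4, "bits": 12000},
]

for lam in (1e-9, 1e-6, 1e-5):
    best = pick_operating_point(candidates, lam)
    print(lam, best["name"])
```

A small multiplier favors the most precise candidate, while a large one favors the cheapest; sweeping the multiplier traces out a rate–distortion curve.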

2. Theoretical Underpinnings and Spectral Properties

Implicit neural compression leverages specific inductive biases and spectral characteristics of the neural architectures. Shallow MLPs with sinusoidal activation (e.g., SIREN) embedded with Fourier features tend to concentrate approximation power around frequencies represented by their input encoding and hidden-layer width, leading to a “spectrum concentration” effect. This property strongly governs parameter efficiency and reconstruction fidelity for data with broadband or multiscale content (Yang et al., 2022).

For a given function $d(x)$ with a target spectrum, the number of INR parameters, the frequency basis of the encoding, and the depth of the network determine how well high-frequency details are preserved. A key finding is that most spectral energy in shallow sinusoidal networks is localized around base frequencies, with amplitude in higher-order harmonics suppressed by the Bessel function decay in the Jacobi–Anger expansion. Thus, signals with broad or spatially varying spectrum benefit from block-wise partitioning and local models (Yang et al., 2022, Yang et al., 2022).
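The harmonic decay can be checked numerically on the simplest case, a single nested sinusoid $\sin(a\sin(\omega x))$, for which the Jacobi–Anger expansion gives $\sin(a\sin\phi)=2\sum_{k\ge 0}J_{2k+1}(a)\sin((2k+1)\phi)$. The sketch below is an illustration under that one-unit assumption, not an analysis of a full network; `bessel_j` evaluates Bessel's integral with a midpoint rule, and the amplitude $a=1$ is an arbitrary choice.

```python
import math

def bessel_j(n, z, steps=2000):
    """J_n(z) from Bessel's integral (1/pi) * int_0^pi cos(n*t - z*sin t) dt."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoint rule
        total += math.cos(n * t - z * math.sin(t))
    return total * h / math.pi

def harmonic_amplitudes(a, max_order):
    """Amplitudes 2*J_k(a) of sin(k*phi), odd k, in sin(a*sin(phi))."""
    return {k: 2.0 * bessel_j(k, a) for k in range(1, max_order + 1, 2)}

def reconstruct(a, phi, max_order):
    """Truncated Jacobi-Anger series for sin(a*sin(phi))."""
    amps = harmonic_amplitudes(a, max_order)
    return sum(amp * math.sin(k * phi) for k, amp in amps.items())

a = 1.0
amps = harmonic_amplitudes(a, 9)
for k, amp in amps.items():
    print(k, abs(amp))  # magnitudes fall off quickly with the harmonic order k

exact = math.sin(a * math.sin(0.7))
approx = reconstruct(a, 0.7, 9)
print(abs(exact - approx))
```

For this amplitude the base harmonic dominates and each further odd harmonic is much smaller, which is the concentration effect described above in its simplest form.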

3. Model Architectures and Compression Workflows

The INR used for implicit compression is usually a compact MLP with periodic (sinusoidal, cosine), ReLU, or learnable activations, often applied after a Fourier feature positional encoding for high-frequency fidelity (Dupont et al., 2021, Yang et al., 2022, Zhang et al., 20 Apr 2025). Notable architectural augmentations include:

Low-rank and adaptive basis factorization for reducing model size while retaining expressivity (e.g., ImpliSat framework uses low-rank modulated SIREN MLPs for multispectral image compression) (Cho et al., 2 Jun 2025).
Hierarchical parameter sharing, as in Tree-structured Implicit Neural Compression (TINC), which encodes both local and non-local redundancy by sharing early layers and only specializing late layers or additive blocks for high-frequency or region-specific details (Yang et al., 2022).
Block-wise partitioning, spectrum-based splitting (e.g., SCI), or patch-based optimization, allocating network capacity according to local frequency content or spatial statistics (Yang et al., 2022).
Learnable and adaptive activation functions in latent space, such as LeAFNet for point clouds, allowing fine spectral adaptation while maintaining a compact parameterization (Zhang et al., 20 Apr 2025).
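To make the parameter saving from low-rank factorization concrete, the sketch below compares a dense $m\times n$ weight matrix with a rank-$r$ product $UV$, where $U$ is $m\times r$ and $V$ is $r\times n$. This is generic low-rank arithmetic under assumed layer sizes; it is not the specific factorization used by ImpliSat or any other cited framework, and the function names are hypothetical.

```python
def dense_params(m, n):
    """Parameters in a dense m x n weight matrix."""
    return m * n

def low_rank_params(m, n, r):
    """Parameters in a rank-r factorization U (m x r) times V (r x n)."""
    return r * (m + n)

def low_rank_matvec(U, V, x):
    """Compute (U V) x as U (V x), never forming the m x n product."""
    t = [sum(v * xi for v, xi in zip(row, x)) for row in V]   # length r
    return [sum(u * ti for u, ti in zip(row, t)) for row in U]  # length m

# A 256 x 256 hidden layer at rank 16 (illustrative sizes).
m, n, r = 256, 256, 16
print(dense_params(m, n), low_rank_params(m, n, r))

# Tiny worked example: U is 2 x 1, V is 1 x 3.
U = [[1.0], [2.0]]
V = [[1.0, 0.0, 1.0]]
print(low_rank_matvec(U, V, [1.0, 2.0, 3.0]))
```

The saving grows as the rank shrinks relative to the layer width, at the cost of restricting the layer to rank-$r$ linear maps.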

The generic workflow is:

Fit an INR $f_\theta$ to the target data sample;
Quantize and entropy-code the weights $\theta^*$ (with bit-precise, possibly layer-wise quantization, entropy coding, or variational models);
Transmit the resulting bitstream; and
Decode by reconstructing the signal as $f_{q(\theta^*)}(x)$ on the target grid.
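The four steps can be sketched end to end on a toy 1D signal. To keep the example dependency-free, a linear model on Fourier features stands in for the MLP, plain gradient descent stands in for the optimizer, and `zlib` stands in for a real entropy coder; all three substitutions, along with the sizes `N` and `K` and the function names, are assumptions of this sketch rather than choices made by any cited codec.

```python
import math
import zlib

N, K = 64, 4  # samples and number of Fourier frequencies (illustrative choices)

def features(x):
    """Fourier features [1, sin(2*pi*k*x), cos(2*pi*k*x)] for k = 1..K."""
    phi = [1.0]
    for k in range(1, K + 1):
        phi += [math.sin(2 * math.pi * k * x), math.cos(2 * math.pi * k * x)]
    return phi

def predict(theta, x):
    return sum(w * p for w, p in zip(theta, features(x)))

def fit(xs, ds, steps=200, lr=0.5):
    """Step 1: overfit the parameters to one signal by gradient descent on MSE."""
    theta = [0.0] * (2 * K + 1)
    for _ in range(steps):
        grad = [0.0] * len(theta)
        for x, d in zip(xs, ds):
            phi = features(x)
            err = sum(w * p for w, p in zip(theta, phi)) - d
            for j, p in enumerate(phi):
                grad[j] += 2.0 * err * p / len(xs)
        theta = [w - lr * g for w, g in zip(theta, grad)]
    return theta

def encode(theta, bits=8):
    """Step 2: uniform quantization (bits <= 8), then zlib as a stand-in coder."""
    lo, hi = min(theta), max(theta)
    step = (hi - lo) / (2 ** bits - 1) or 1.0
    idx = bytes(round((w - lo) / step) for w in theta)
    return {"lo": lo, "step": step, "payload": zlib.compress(idx)}

def decode(stream, xs):
    """Step 4: dequantize the weights and evaluate the function on the grid."""
    idx = zlib.decompress(stream["payload"])
    theta_hat = [stream["lo"] + i * stream["step"] for i in idx]
    return [predict(theta_hat, x) for x in xs]

xs = [i / N for i in range(N)]
ds = [math.sin(2 * math.pi * x) + 0.5 * math.cos(6 * math.pi * x) for x in xs]
stream = encode(fit(xs, ds))   # step 3 would transmit `stream`
recon = decode(stream, xs)
mse = sum((r - d) ** 2 for r, d in zip(recon, ds)) / N
print(len(stream["payload"]), mse)
```

With only nine weights, the `zlib` container overhead exceeds any saving, so the payload size here says nothing about achievable rates; the point is the order of operations and the fact that the decoder needs only the coded weights and a little metadata (`lo`, `step`).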

In video and dynamic point-cloud compression, spatio-temporal or view-time coordinates drive the INR, with temporal redundancy handled either via sequential parameterization, residuals, or global 4D embedding (Gao et al., 25 Mar 2025, Ruan et al., 2024). For multi-spectral images, dynamic Fourier modulation and hypernetworks adapt the parameterization per band and resolution (Cho et al., 2 Jun 2025).

4. Quantization, Entropy Coding, and Rate–Distortion Control

Compressed INRs are made bit-efficient via aggressive quantization (e.g., down to 7–8 bit precision with minimal degradation using quantization-aware training or AdaRound) (Strümpler et al., 2021, Damodaran et al., 2023), border-aware and empirical entropy coding of quantized weights (Damodaran et al., 2023, Strümpler et al., 2021), and learned priors on parameters (Gaussian, variational, data-driven) (Guo et al., 2023, Schwarz et al., 2023).
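The effect of bit width on rate can be illustrated with plain post-training uniform quantization and an empirical entropy estimate. This is a simplified stand-in: it is neither quantization-aware training nor AdaRound, and the Gaussian samples below are synthetic placeholders for trained INR weights.

```python
import math
import random
from collections import Counter

def quantize(weights, bits):
    """Uniform quantization over [min, max]; returns indices and dequantized values."""
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / (2 ** bits - 1) or 1.0
    idx = [round((w - lo) / step) for w in weights]
    deq = [lo + i * step for i in idx]
    return idx, deq

def empirical_entropy_bits(symbols):
    """Shannon entropy (bits per symbol) of the empirical symbol histogram."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())

rng = random.Random(0)
weights = [rng.gauss(0.0, 0.1) for _ in range(5000)]  # synthetic stand-in weights

for bits in (4, 6, 8):
    idx, deq = quantize(weights, bits)
    entropy = empirical_entropy_bits(idx)
    max_err = max(abs(w - d) for w, d in zip(weights, deq))
    print(bits, entropy, max_err)
```

Because weight histograms are typically far from uniform, the empirical entropy falls below the nominal bit width, which is the gap an entropy coder can exploit.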

Some frameworks perform meta-learning (e.g., with MAML) to find a global initialization for quick per-sample fitting and to minimize the amplitude and entropy of weight updates, which facilitates better quantization and entropy coding (Strümpler et al., 2021). Bayesian approaches fit variational posteriors to parameters and perform entropy coding using relative-entropy codes, so that the rate is set by the KL divergence of the posterior from the prior ($R \approx D_{\mathrm{KL}}(Q\,\|\,P)$, with $Q$ the variational posterior and $P$ the prior over the weights) (Guo et al., 2023).
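For the Bayesian view, the rate term can be sketched under the assumption of factorized Gaussian posteriors and a shared Gaussian prior, for which the KL divergence has a closed form. The posterior values below are hypothetical, and the function names are introduced only for this example; practical relative-entropy coders also incur some overhead beyond the KL term.

```python
import math

def gaussian_kl_bits(mu_q, sigma_q, mu_p, sigma_p):
    """KL( N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2) ), converted to bits."""
    nats = (math.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
            - 0.5)
    return nats / math.log(2.0)

def total_rate_bits(posterior, prior):
    """Sum of per-weight KL terms for a factorized posterior."""
    mu_p, sigma_p = prior
    return sum(gaussian_kl_bits(m, s, mu_p, sigma_p) for m, s in posterior)

# Hypothetical per-weight posteriors (mean, std) against a N(0, 1) prior.
posterior = [(0.0, 1.0), (1.0, 1.0), (0.2, 0.1)]
print(total_rate_bits(posterior, (0.0, 1.0)))
```

A weight whose posterior matches the prior costs no bits, while a sharply peaked or shifted posterior costs more, so the rate is controlled continuously rather than through a fixed bit width.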

The overall rate–distortion optimization is sometimes convexified and controlled directly during fitting by Lagrange multipliers or alternating quantization-aware and distortion-reducing gradient steps (Dupont et al., 2021, Damodaran et al., 2023, Cho et al., 2 Jun 2025).

5. Applications and Empirical Results Across Modalities

Images, Light Fields, and Volumetric Data

Simple INR codecs (COIN, RQAT-INR) can outpace JPEG and approach learned autoencoders at low bitrates, with competitive PSNR and lower decoder complexity (Dupont et al., 2021, Damodaran et al., 2023).
Meta-learned and quantization-aware INRs can close the gap to JPEG2000 and BPG at mid rates; INRs trained with entropy coding and advanced architecture outperform block coders on certain benchmarks (Strümpler et al., 2021, Damodaran et al., 2023).
Light-field INR compression encodes SAIs as functions of (u,v), storing angular redundancy implicitly, and achieves comparable or superior perceptual quality versus classical approaches (Wang et al., 2024).

Biomedical, Climate, and Scientific Data

Spectrum-concentrated and tree-based INRs (SCI, TINC) achieve state-of-the-art performance on medical volumes, neuronal structures, and atmospheric fields, outperforming HEVC and tailored codecs at high compression ratios (up to 512–1024×) (Yang et al., 2022, Yang et al., 2022, Xu et al., 2024).
Application-specific guidance (segmentation, perceptual, or task loss) can be seamlessly incorporated during fitting to prioritize critical features in microscopy or scientific imaging (Dai et al., 2024).
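To give a sense of scale for compression ratios in the 512–1024× range, the arithmetic below converts a ratio into a weight budget. The volume size (256³ samples at 16 bits) and the 8-bit weight precision are hypothetical inputs chosen for illustration, and metadata and entropy-coding gains are ignored.

```python
def weight_budget(num_samples, bits_per_sample, ratio, bits_per_weight):
    """Number of weights that fit in the bit budget implied by a compression ratio."""
    raw_bits = num_samples * bits_per_sample
    budget_bits = raw_bits / ratio
    return int(budget_bits // bits_per_weight)

samples = 256 ** 3  # hypothetical volume
for ratio in (512, 1024):
    print(ratio, weight_budget(samples, 16, ratio, 8))
```

Under these assumptions the whole volume must be described by a few tens of thousands of weights, which is why capacity allocation across blocks matters at such ratios.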

Point Cloud and Dynamic Scene Compression

Per-instance INR fitting for geometry and color fields, with quantization and adaptive activations, surpasses octree and G-PCC standards for both static and dynamic point clouds, with up to +7 dB PSNR improvement and up to 89% bit-rate reduction in dynamic settings (Zhang et al., 20 Apr 2025, Ruan et al., 2024).

Video and Multi-View

Implicit Neural Video Compression (IPF, GIViC) leverages coordinate-mapped frame prediction or full-GOP (Group of Pictures) functional fitting, outperforming MPEG-4 at low rates and, with generative hierarchical models, surpassing VTM and state-of-the-art neural codecs on benchmark sets (Gao et al., 25 Mar 2025, Zhang et al., 2021).
Multi-view integration approaches blend implicit and explicit coding, using INR-based codecs to encode all but one anchor view, enabling state-of-the-art scene modeling and improved rate-distortion over MIV (Zhu et al., 2023).

Advanced: Quantum and Modality-Agnostic Compression

Quantum neural architectures (quINR) improve rate-distortion by leveraging exponential Hilbert space capacity, delivering up to +1.2 dB PSNR gain over classical MLP INR on images at a given bit rate (Fujihashi et al., 2024).
Modality-agnostic variational INR compression (VC-INR) yields state-of-the-art performance across images, audio, climate, and 3D scenes, with architecture-agnostic soft-subnetwork gating and learned entropy models for the latent space (Schwarz et al., 2023).

6. Limitations, Generalization, and Extensions

The principal limitations of implicit compression are encoder-side computational cost (per-sample fitting generally requires thousands of gradient steps), challenges in capturing high-frequency content with small models (spectral bandwidth bottleneck), and the lack of fast, amortized inference for deployment at scale (Dupont et al., 2021, Strümpler et al., 2021, Damodaran et al., 2023, Xu et al., 2024). Overfitting-based methods may blur high-frequency, nonstationary structures unless model size or block count is increased, or spectrum-based partitioning is employed (Yang et al., 2022).

Recent approaches address these by:

Incorporating tree- or spectrum-structured block division with hierarchical parameter sharing (Yang et al., 2022, Yang et al., 2022).
Utilizing learned optimizers and meta-initializations to reduce fitting time (Dai et al., 2024, Strümpler et al., 2021).
Hybridizing explicit codecs for the dominant portions of data and using INR-based residual coding or adapters for the error, accelerating fitting without sacrificing quality (Dai et al., 2024, Zhu et al., 2023).
Employing probabilistic and variational compression methods (e.g., variational Bayesian INRs, relative-entropy coding) that provide finer rate-distortion control, improved entropy modeling, and modality-agnostic applicability (Guo et al., 2023, Schwarz et al., 2023).

Extensions under active study include neural architecture search for function parameterization, adaptively-learned quantization or pruning schedules, block-wise fitting for massive scenes, hybrid quantum-classical architectures, and expansion to new modalities including hyperspectral, scientific, environmental, and multi-modal data (Fujihashi et al., 2024, Cho et al., 2 Jun 2025, Xu et al., 2024).

7. Summary Table of Recent Implicit Compression Approaches

Model/Framework	Domain	Key Architectural Features	Rate-Distortion/Empirical Outcomes	Reference
COIN, RQAT-INR	Images	MLP+Fourier features, QAT, AdaRound	Outperforms JPEG/JPEG2000 at low bpp	(Dupont et al., 2021, Damodaran et al., 2023)
TINC	Volumes	Tree-based, shared MLP layers	+1–2 dB PSNR vs. SOTA at 512–1024× compression	(Yang et al., 2022)
SCI	Biomedical	Spectrum-adaptive partition, funnel MLP	State-of-the-art on CT/sparse data, PSNR, Acc	(Yang et al., 2022)
GIViC, IPF	Video	INR+diffusion/transformer, coordinate flow	Outperforms VVC/VTM, MPEG-4 at low/RA bitrate	(Gao et al., 25 Mar 2025, Zhang et al., 2021)
ImpliSat	Multispectral	Fourier modulation, per-band adaptation	10× compression at PSNR=36 dB (Sentinel-2)	(Cho et al., 2 Jun 2025)
PICO, NeRC³	Point Clouds	Learnable activation, joint geometry/color	+4–7 dB vs. G-PCC, up to 89% BD-BR reduction	(Zhang et al., 20 Apr 2025, Ruan et al., 2024)
INIF, HiHa	Microscopy, Atmos	Multi-loss INR, harmonic decomposition	Up to 512× compression, 20–40× compression for 1e-3 RMSE	(Dai et al., 2024, Xu et al., 2024)
VC-INR, COMBINER	Multi-modal	Variational quantization, gating, Bayes prior	Outperforms JPEG2000, MP3, HEVC cross-modality	(Schwarz et al., 2023, Guo et al., 2023)

Implicit compression thus constitutes a unifying and highly extensible framework for modality-agnostic, function-based compression, capable of rivaling or surpassing expert-tuned codecs in diverse domains through advances in rate-distortion optimization, network architecture, and entropy modeling.