- The paper introduces a novel framework using bit-plane decomposition to achieve lossless implicit neural representations, significantly reducing the theoretical parameter upper bound required for high bit-depth signals.
- This approach trains a single unified function over the decomposed binary bit-planes and empirically observes a 'bit bias': neural networks learn more significant bits faster than less significant ones.
- Experimental results demonstrate accelerated convergence towards true lossless reconstruction and successful application of the framework to tasks like image compression and bit-depth expansion.
The paper presents a novel framework for lossless implicit neural representations (INRs) by exploiting bit-plane decomposition that effectively lowers the theoretical upper bound on the number of parameters required to represent a discrete signal at a specified bit precision. The work is grounded in recent theoretical results quantifying error tolerance versus model complexity for coordinate-based multilayer perceptrons (MLPs), and it derives an explicit formulation for an upper bound function,
$$U_d(n) = C\left(2^{n+1} - 2\right)^{2d},$$
where
- n denotes the bit precision,
- d is the input dimension, and
- C is a constant determined by domain properties.
This formulation implies an exponential growth in the parameter count with increasing bit precision, which poses a significant challenge when representing high bit-depth signals.
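To make the scaling concrete, here is a minimal sketch that evaluates the bound for a 2D image at several bit depths; the constant C is set to 1 purely as a placeholder, since its actual value depends on domain properties:

```python
# Sketch of the upper bound U_d(n) = C * (2**(n + 1) - 2)**(2 * d).
# C = 1.0 is a placeholder; the paper ties C to properties of the domain.

def param_upper_bound(n_bits: int, d: int, C: float = 1.0) -> float:
    """Sufficient parameter count for a d-dimensional, n-bit signal."""
    return C * (2 ** (n_bits + 1) - 2) ** (2 * d)

for n in (1, 8, 16):
    print(f"n = {n:2d} bits -> U_2(n) ~ {param_upper_bound(n, d=2):.3e}")
# The 1-bit bound is many orders of magnitude below the 8- and 16-bit bounds,
# which is what motivates decomposing the signal into bit-planes.
```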
The key idea is to decompose an n-bit signal into its constituent bit-planes. Specifically, an n-bit image is reconstructed via
$$I_n = \frac{1}{2^n - 1}\sum_{i=0}^{n-1} 2^i\, B^{(i)},$$
with each bit-plane $B^{(i)} \in \{0,1\}^{H \times W \times 3}$ representing the binary image corresponding to the $i$-th bit. By modeling each bit-plane as a separate 1-bit signal, the upper bound is dramatically reduced to $U_d(1)$, which is orders of magnitude lower than that for full $n$-bit representations. This reduction in the required parameter capacity speeds up convergence toward lossless reconstruction given a fixed model size.
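For concreteness, a minimal NumPy sketch of this decomposition and of the reconstruction formula above (array shapes and function names are illustrative, not taken from the paper's code):

```python
import numpy as np

def to_bit_planes(img: np.ndarray, n_bits: int) -> np.ndarray:
    """Split an n-bit integer image (H, W, 3) into n binary planes B^(i), shape (n, H, W, 3)."""
    return np.stack([(img >> i) & 1 for i in range(n_bits)]).astype(np.uint8)

def from_bit_planes(planes: np.ndarray) -> np.ndarray:
    """Recombine planes via I_n = (1 / (2^n - 1)) * sum_i 2^i B^(i), returned in [0, 1]."""
    n_bits = planes.shape[0]
    weights = (2 ** np.arange(n_bits)).reshape(-1, 1, 1, 1)
    return (weights * planes).sum(axis=0) / (2 ** n_bits - 1)

img = np.random.randint(0, 2 ** 16, size=(64, 64, 3), dtype=np.uint16)  # toy 16-bit image
planes = to_bit_planes(img, n_bits=16)
assert np.allclose(from_bit_planes(planes), img / (2 ** 16 - 1))  # lossless round trip
```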
Major contributions and findings of the paper include:
- Theoretical Quantification of Parameter Upper Bound:
The authors rigorously derive the exponential relationship between bit precision and the sufficient number of network parameters based on error tolerance. This result forms the basis for motivating a reduction of the effective bit precision by decomposing the target signal into bit-planes.
- Bit-Plane Decomposition Strategy:
Instead of directly regressing a high bit-depth signal, the method reformulates the problem by learning a function
$$\hat{B}^{(i)}(\mathbf{x}) \approx f_\theta(\mathbf{x}, i)$$
that takes spatial coordinates and a bit index as inputs simultaneously. This leverages the inherent correlation among bit-planes and allows training with a binary cross-entropy (BCE) loss, which empirically converges faster than pixel-wise regression losses such as mean squared error (MSE); a minimal training sketch of this formulation appears after the contributions list below.
- Discovery of Bit Bias Phenomenon:
A notable empirical observation is that INRs tend to learn the most significant bits (MSBs) much faster than the least significant bits (LSBs). This “bit bias” mirrors the well-known spectral bias in coordinate-MLPs, but here it emerges along the discrete bit axis. The paper further explores a “bit-spectral bias” indicating that certain pixel intensities (e.g., very high or low values) are fit more readily, which guides the design of losses and network architectures.
- Enhanced Convergence and Lossless Representation:
Experimental results on both 16-bit and 8-bit image datasets demonstrate that reducing the upper bound through bit-plane decomposition considerably accelerates convergence. The method achieves true lossless reconstruction, indicated by a bit-error rate (BER) of zero and an infinite peak signal-to-noise ratio (PSNR) once losslessness is reached. Quantitative comparisons show that, for the same number of parameters, the proposed method converges in roughly 30–40% fewer iterations than baseline INR approaches with conventional activations (such as sine, or ReLU with positional encoding).
- Extensions to Related Applications:
The framework is also applied to bit-depth expansion, lossless compression, and extreme network quantization. Notably, by shifting to a ternary-weight INR model (with weights constrained to {−1, 0, 1}), the paper demonstrates that lossless representations can be achieved with a greatly reduced memory footprint and fewer bit operations per pixel. In addition, when trained in a self-supervised manner on only the MSBs, the network can extrapolate the missing lower bits, outperforming conventional rule-based and learning-based bit-depth expansion methods.
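As referenced above, the following is a minimal PyTorch sketch of the unified bit-plane function described in the contributions: a plain coordinate MLP that takes (x, y, i) and is trained with BCE against binary bit-plane values. The width, depth, activation, input encoding, and single-channel output are illustrative simplifications; the paper's actual architecture (e.g., any positional or bit-index encoding, RGB handling) may differ.

```python
import torch
import torch.nn as nn

class BitPlaneINR(nn.Module):
    """Toy coordinate MLP f_theta(x, i): pixel coordinates plus a bit index in, a bit logit out."""
    def __init__(self, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),   # input: (x, y, normalized bit index i)
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),              # logit for B^(i)(x, y); single channel for simplicity
        )

    def forward(self, coords_and_bit: torch.Tensor) -> torch.Tensor:
        return self.net(coords_and_bit)

model = BitPlaneINR()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()  # BCE on binary bit-planes, per the paper's observation

def train_step(coords_and_bit: torch.Tensor, bits: torch.Tensor) -> float:
    """One gradient step; coords_and_bit is (N, 3) and bits is the (N,) binary target."""
    opt.zero_grad()
    loss = loss_fn(model(coords_and_bit).squeeze(-1), bits.float())
    loss.backward()
    opt.step()
    return loss.item()
```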
Additional experimental findings include:
- Convergence Curves and Hypothesis Validation:
Plots of BER and PSNR against iteration count show that convergence is notably faster when the model's parameter count is close to the derived upper bound. Ablation studies confirm that the BCE loss is particularly effective for reaching lossless representations in this binary decomposition setting (a short sketch of how the BER and PSNR metrics are computed follows this list).
- Combining with Existing Architectures:
The authors also perform experiments in which the bit-plane decomposition strategy is integrated with hash-based methods (e.g., Instant-NGP) or efficient activations (e.g., FINER). These combinations further reduce the iteration counts required to reach losslessness, demonstrating the versatility of the proposed approach.
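As referenced above, a minimal sketch of how the two reported metrics relate: a reconstruction is lossless exactly when the BER is zero, at which point the MSE vanishes and the PSNR is reported as infinite. The function names here are illustrative.

```python
import numpy as np

def bit_error_rate(pred_bits: np.ndarray, true_bits: np.ndarray) -> float:
    """Fraction of mismatched bits across all bit-planes; 0.0 means lossless."""
    return float(np.mean(pred_bits != true_bits))

def psnr(pred: np.ndarray, target: np.ndarray, peak: float = 1.0) -> float:
    """PSNR in dB for signals normalized to [0, peak]; infinite when the reconstruction is exact."""
    mse = float(np.mean((pred - target) ** 2))
    return float("inf") if mse == 0.0 else 10.0 * np.log10(peak ** 2 / mse)
```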
In summary, the paper rigorously addresses the fundamental limits of lossless implicit neural representations imposed by digital quantization. By decomposing high bit-depth signals into binary bit-planes, the approach reduces the required model capacity and accelerates convergence, while also revealing novel insights into bit bias during INR training. The work offers a unified framework with practical applications in image compression, bit-depth expansion, and model quantization, making it a significant contribution to the field of implicit neural representations.