An Efficient Implicit Neural Representation Image Codec Based on Mixed Autoregressive Model for Low-Complexity Decoding (2401.12587v2)
Abstract: Displaying high-quality images on edge devices, such as augmented reality devices, is essential for enhancing the user experience. However, these devices often face power consumption and computing resource limitations, making it challenging to apply many deep learning-based image compression algorithms in this field. Implicit Neural Representation (INR) for image compression is an emerging technology that offers two key benefits over cutting-edge autoencoder models: low computational complexity and parameter-free decoding. It also outperforms many traditional and early neural compression methods in quality. In this study, we introduce a new Mixed AutoRegressive Model (MARM) to significantly reduce the decoding time of the current INR codec, along with a new synthesis network to enhance reconstruction quality. MARM combines our proposed AutoRegressive Upsampler (ARU) blocks, which are highly computationally efficient, with the ARM of previous work to balance decoding time and reconstruction quality. We further enhance ARU's performance with a checkerboard two-stage decoding strategy. Moreover, the ratio of the two module types can be adjusted to balance quality and speed. Comprehensive experiments demonstrate that our method significantly improves computational efficiency while preserving image quality. Depending on the parameter settings, our method achieves over an order of magnitude acceleration in decoding time without industrial-level optimization, or state-of-the-art reconstruction quality compared with other INR codecs. To the best of our knowledge, our method is the first INR-based codec comparable to Hyperprior in both decoding speed and quality while maintaining low complexity.
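The checkerboard two-stage decoding strategy mentioned in the abstract can be sketched as follows. This is a minimal illustration of the general idea (decode half the positions in parallel, then decode the other half conditioned on their decoded neighbors), not the paper's actual network: `decode_fn` and both helper names are hypothetical stand-ins.

```python
def checkerboard_masks(h, w):
    """Partition pixel coordinates into anchor / non-anchor sets
    according to checkerboard parity (hypothetical helper)."""
    anchors = [(y, x) for y in range(h) for x in range(w) if (y + x) % 2 == 0]
    non_anchors = [(y, x) for y in range(h) for x in range(w) if (y + x) % 2 == 1]
    return anchors, non_anchors

def two_stage_decode(h, w, decode_fn):
    """Decode a latent plane in two passes instead of h*w sequential steps.

    `decode_fn(positions, context)` is a placeholder for entropy-decoding all
    `positions` at once, conditioned on already-decoded values in `context`
    (a dict mapping (y, x) -> value); it stands in for the learned model.
    """
    anchors, non_anchors = checkerboard_masks(h, w)
    plane = {}
    # Stage 1: all anchor positions are decoded in parallel, with no
    # spatial context available yet.
    for pos, val in zip(anchors, decode_fn(anchors, {})):
        plane[pos] = val
    # Stage 2: every non-anchor position sees its four (already decoded)
    # anchor neighbors, so this pass is also fully parallel.
    for pos, val in zip(non_anchors, decode_fn(non_anchors, dict(plane))):
        plane[pos] = val
    return plane
```

Because each stage has no intra-stage dependencies, an autoregressive model that would otherwise need one sequential step per pixel needs only two parallel passes per plane.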
- Johannes Ballé, Valero Laparra, and Eero P. Simoncelli. 2016. End-to-end optimized image compression. arXiv preprint arXiv:1611.01704 (2016).
- Johannes Ballé, David Minnen, Saurabh Singh, Sung Jin Hwang, and Nick Johnston. 2018. Variational image compression with a scale hyperprior. arXiv preprint arXiv:1802.01436 (2018).
- Robert Bamler. 2022. Understanding Entropy Coding With Asymmetric Numeral Systems (ANS): a Statistician’s Perspective. arXiv preprint arXiv:2201.01741 (2022).
- Fabrice Bellard. 2018. BPG Image format. https://bellard.org/bpg/
- Jean Bégaint, Fabien Racapé, Simon Feltman, and Akshay Pushparaja. 2020. CompressAI: a PyTorch library and evaluation platform for end-to-end compression research. arXiv preprint arXiv:2011.03029 (2020).
- Yinbo Chen, Sifei Liu, and Xiaolong Wang. 2021. Learning continuous image representation with local implicit image function. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8628–8638.
- Zhiqin Chen and Hao Zhang. 2019. Learning implicit fields for generative shape modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5939–5948.
- Zhengxue Cheng, Heming Sun, Masaru Takeuchi, and Jiro Katto. 2020. Learned image compression with discretized Gaussian mixture likelihoods and attention modules. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 7939–7948.
- Emilien Dupont, Adam Goliński, Milad Alizadeh, Yee Whye Teh, and Arnaud Doucet. 2021. COIN: Compression with implicit neural representations. arXiv preprint arXiv:2103.03123 (2021).
- Emilien Dupont, Hrushikesh Loya, Milad Alizadeh, Adam Goliński, Yee Whye Teh, and Arnaud Doucet. 2022. COIN++: Data agnostic neural compression. arXiv preprint arXiv:2201.12904 (2022).
- Emilien Dupont, Yee Whye Teh, and Arnaud Doucet. 2021. Generative models as distributions of functions. arXiv preprint arXiv:2102.04776 (2021).
- Vivek K Goyal. 2001. Theoretical foundations of transform coding. IEEE Signal Processing Magazine 18, 5 (2001), 9–21.
- Dailan He, Ziming Yang, Weikun Peng, Rui Ma, Hongwei Qin, and Yan Wang. 2022. ELIC: Efficient learned image compression with unevenly grouped space-channel contextual adaptive coding. arXiv preprint arXiv:2203.10886 (2022).
- Dailan He, Yaoyan Zheng, Baocheng Sun, Yan Wang, and Hongwei Qin. 2021. Checkerboard context model for efficient learned image compression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 14771–14780.
- Théo Ladune, Pierrick Philippe, Félix Henry, Gordon Clare, and Thomas Leguay. 2023. COOL-CHIC: Coordinate-based low complexity hierarchical image codec. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 13515–13522.
- Thomas Leguay, Théo Ladune, Pierrick Philippe, Gordon Clare, and Félix Henry. 2023. Low-complexity overfitted neural image codec. arXiv preprint arXiv:2307.12706 (2023).
- Jiahao Li, Bin Li, and Yan Lu. 2023. Neural video compression with diverse contexts. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 22616–22626.
- Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. 2020. NeRF: Representing scenes as neural radiance fields for view synthesis. In European Conference on Computer Vision. Springer.
- David Minnen, Johannes Ballé, and George Toderici. 2018. Joint autoregressive and hierarchical priors for learned image compression. Advances in Neural Information Processing Systems 31 (2018).
- David Minnen and Saurabh Singh. 2020. Channel-wise autoregressive entropy models for learned image compression. In 2020 IEEE International Conference on Image Processing (ICIP). IEEE.
- David Minnen, George Toderici, Saurabh Singh, Sung Jin Hwang, and Michele Covell. 2018. Image-dependent local entropy models for learned image compression. In 2018 25th IEEE International Conference on Image Processing (ICIP). IEEE, 430–434.
- Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. 2022. Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics (ToG) 41, 4 (2022), 1–15.
- Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. 2019. DeepSDF: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 165–174.
- Scott Reed, Aäron van den Oord, Nal Kalchbrenner, Sergio Gómez Colmenarejo, Ziyu Wang, Yutian Chen, Dan Belov, and Nando de Freitas. 2017. Parallel multiscale autoregressive density estimation. In International Conference on Machine Learning. PMLR, 2912–2921.
- Yannick Strümpler, Janis Postels, Ren Yang, Luc Van Gool, and Federico Tombari. 2022. Implicit neural representations for image compression. In European Conference on Computer Vision. Springer, 74–91.
- Aäron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, and Koray Kavukcuoglu. 2016. Conditional image generation with PixelCNN decoders. Advances in Neural Information Processing Systems 29 (2016).
- Aäron van den Oord, Nal Kalchbrenner, and Koray Kavukcuoglu. 2016. Pixel recurrent neural networks. In International Conference on Machine Learning. PMLR, 1747–1756.
- Gregory K Wallace. 1992. The JPEG still picture compression standard. IEEE transactions on consumer electronics 38, 1 (1992), xviii–xxxiv.
- Zhou Wang, Eero P. Simoncelli, and Alan C. Bovik. 2003. Multiscale structural similarity for image quality assessment. In The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003, Vol. 2. IEEE, 1398–1402.