Image-GS: Content-Adaptive Image Representation via 2D Gaussians (2407.01866v2)
Abstract: Neural image representations have emerged as a promising approach for encoding and rendering visual data. Combined with learning-based workflows, they demonstrate impressive trade-offs between visual fidelity and memory footprint. Existing methods in this domain, however, often rely on fixed data structures that suboptimally allocate memory or compute-intensive implicit models, hindering their practicality for real-time graphics applications. Inspired by recent advancements in radiance field rendering, we introduce Image-GS, a content-adaptive image representation based on 2D Gaussians. Leveraging a custom differentiable renderer, Image-GS reconstructs images by adaptively allocating and progressively optimizing a group of anisotropic, colored 2D Gaussians. It achieves a favorable balance between visual fidelity and memory efficiency across a variety of stylized images frequently seen in graphics workflows, especially for those showing non-uniformly distributed features and in low-bitrate regimes. Moreover, it supports hardware-friendly rapid random access for real-time usage, requiring only 0.3K MACs to decode a pixel. Through error-guided progressive optimization, Image-GS naturally constructs a smooth level-of-detail hierarchy. We demonstrate its versatility with several applications, including texture compression, semantics-aware compression, and joint image compression and restoration.
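To make the per-pixel decoding idea concrete, the following is a minimal sketch of how a pixel could be reconstructed from a small set of anisotropic, colored 2D Gaussians. It is not the paper's renderer: the abstract does not specify the blending rule, so the normalized weighted sum used here is an assumption, and the names `Gaussian2D`, `weight`, and `decode_pixel` are hypothetical. Adaptive allocation and error-guided optimization are not shown.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gaussian2D:
    """Hypothetical parameterization of one anisotropic, colored 2D Gaussian."""
    mean: Tuple[float, float]            # center (x, y) in pixel coordinates
    scale: Tuple[float, float]           # std. dev. along the two principal axes
    theta: float                         # rotation of the principal axes (radians)
    color: Tuple[float, float, float]    # RGB in [0, 1]


def weight(g: Gaussian2D, x: float, y: float) -> float:
    """Unnormalized Gaussian response of g at pixel position (x, y)."""
    dx, dy = x - g.mean[0], y - g.mean[1]
    c, s = math.cos(g.theta), math.sin(g.theta)
    # Rotate the offset into the Gaussian's principal-axis frame.
    u = c * dx + s * dy
    v = -s * dx + c * dy
    return math.exp(-0.5 * ((u / g.scale[0]) ** 2 + (v / g.scale[1]) ** 2))


def decode_pixel(gaussians: List[Gaussian2D], x: float, y: float,
                 eps: float = 1e-8) -> Tuple[float, float, float]:
    """Blend Gaussian colors at (x, y) with weights normalized to sum to one.

    Assumption: a normalized weighted sum; the actual blending rule used by
    Image-GS is not stated in the abstract.
    """
    acc = [0.0, 0.0, 0.0]
    total = 0.0
    for g in gaussians:
        w = weight(g, x, y)
        total += w
        for k in range(3):
            acc[k] += w * g.color[k]
    return (acc[0] / (total + eps), acc[1] / (total + eps), acc[2] / (total + eps))
```

Because each pixel depends only on the Gaussians evaluated at its own position, any pixel can be decoded independently of the others, which is the property that random access relies on. In this sketch the cost per pixel grows with the number of Gaussians visited; a practical decoder would restrict the loop to Gaussians near the pixel.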