CompGS: Efficient 3D Scene Representation via Compressed Gaussian Splatting (2404.09458v1)
Abstract: Gaussian splatting, known for its rendering quality and efficiency, has become a prominent technique for 3D scene representation. However, its substantial data volume limits its practical use in real-world applications. We propose an efficient 3D scene representation, Compressed Gaussian Splatting (CompGS), which uses compact Gaussian primitives to model 3D scenes faithfully at a markedly reduced data size. To keep the primitives compact, we devise a hybrid primitive structure that captures predictive relationships among primitives: a small set of anchor primitives serves as the prediction reference, so that the majority of primitives can be stored in highly compact residual forms. Moreover, we develop a rate-constrained optimization scheme to eliminate redundancies within these hybrid primitives, steering CompGS towards an optimal trade-off between bitrate consumption and representation efficacy. Experimental results show that CompGS significantly outperforms existing methods, achieving superior compactness in 3D scene representation without compromising model accuracy or rendering quality. Our code will be released on GitHub for further research.
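The two ideas named in the abstract, predicting most primitives from a few anchors and trading bitrate against fidelity, can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the use of plain attribute vectors in place of Gaussian primitives, the uniform quantizer, the empirical-entropy rate estimate, and the weight `lam` are all assumptions made for illustration.

```python
import numpy as np

def quantize(x, step):
    """Uniform scalar quantization to integer symbols (assumed quantizer)."""
    return np.round(x / step).astype(np.int64)

def empirical_bits(symbols):
    """Total coding cost in bits, estimated from the empirical symbol entropy."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(counts * np.log2(p)).sum())

def encode_residuals(anchor, coupled, step):
    """Predict each coupled vector from the anchor and quantize the residual."""
    return quantize(coupled - anchor[None, :], step)

def decode_residuals(anchor, symbols, step):
    """Reconstruct coupled vectors from the anchor plus dequantized residuals."""
    return anchor[None, :] + symbols * step

def rd_cost(original, reconstructed, bits, lam):
    """Rate-distortion objective: mean squared error plus lam times the rate."""
    distortion = float(np.mean((original - reconstructed) ** 2))
    return distortion + lam * bits

# Toy data: one hypothetical anchor and 16 vectors that lie close to it.
rng = np.random.default_rng(0)
anchor = rng.normal(size=8)
coupled = anchor + 0.05 * rng.normal(size=(16, 8))
step = 0.01

sym = encode_residuals(anchor, coupled, step)
rec = decode_residuals(anchor, sym, step)

bits_residual = empirical_bits(sym)                   # residual form
bits_direct = empirical_bits(quantize(coupled, step)) # coding values directly
print(bits_residual, bits_direct, rd_cost(coupled, rec, bits_residual, lam=1e-3))
```

Because the residuals cluster around zero, their symbol distribution is narrower than that of the raw values, so the estimated rate drops at the same quantization step. Raising `lam` in `rd_cost` penalizes bits more heavily, which is the kind of trade-off a rate-constrained optimization steers.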