EAGLES: Efficient Accelerated 3D Gaussians with Lightweight EncodingS (2312.04564v3)
Abstract: 3D Gaussian splatting (3D-GS) has recently gained popularity for novel-view scene synthesis, addressing the long training times and slow rendering speeds associated with Neural Radiance Fields (NeRFs). Through fast, differentiable rasterization of 3D Gaussians, 3D-GS achieves real-time rendering and accelerated training. However, it demands substantial memory for both training and storage, since each scene is represented by a point cloud of millions of Gaussians. We present a technique that uses quantized embeddings to significantly reduce per-point storage, together with a coarse-to-fine training strategy for faster and more stable optimization of the Gaussian point clouds. Our approach also includes a pruning stage that yields scene representations with fewer Gaussians, enabling faster training and real-time rendering of high-resolution scenes. We reduce storage memory by more than an order of magnitude while preserving reconstruction quality. We validate the effectiveness of our approach on a variety of datasets and scenes, preserving visual quality while consuming 10–20× less memory and training/rendering faster. Project page and code are available at https://efficientgaussian.github.io
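To make the storage-reduction idea concrete, here is a minimal sketch of quantizing per-point Gaussian attributes to 8-bit integers with a shared scale. This is an illustrative toy, not the paper's actual quantized-embedding scheme; the array shapes, bit width, and uniform per-tensor quantizer are all assumptions for demonstration.

```python
import numpy as np

# Hypothetical per-point attributes for 100k Gaussians (e.g. color/opacity
# features); real scenes hold millions of points, hence the memory pressure.
rng = np.random.default_rng(0)
attrs = rng.normal(size=(100_000, 16)).astype(np.float32)

def quantize_uniform(x, n_bits=8):
    """Uniformly quantize x to signed n-bit integers with one shared scale."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = float(np.abs(x).max()) / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map the integer codes back to approximate float attributes."""
    return q.astype(np.float32) * scale

q, scale = quantize_uniform(attrs)
recon = dequantize(q, scale)

# float32 -> int8 shrinks attribute storage 4x; rounding error is bounded
# by half a quantization step.
print(attrs.nbytes / q.nbytes)                    # 4.0
print(float(np.abs(attrs - recon).max()) <= scale)  # True
```

In practice the paper combines such compressed per-point representations with pruning, so savings come both from fewer bits per attribute and from fewer Gaussians overall.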
- Sharath Girish
- Kamal Gupta
- Abhinav Shrivastava