LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS (2311.17245v6)
Abstract: Recent advances in real-time neural rendering using point-based techniques have enabled broader adoption of 3D representations. However, foundational approaches like 3D Gaussian Splatting impose substantial storage overhead, as Structure-from-Motion (SfM) points can grow to millions, often requiring gigabyte-level disk space for a single unbounded scene. This growth presents scalability challenges and hinders splatting efficiency. To address this, we introduce LightGaussian, a method for transforming 3D Gaussians into a more compact format. Inspired by network pruning, LightGaussian identifies Gaussians with minimal global significance for scene reconstruction and applies a pruning and recovery process to reduce redundancy while preserving visual quality. Knowledge distillation and pseudo-view augmentation then transfer spherical harmonic coefficients to a lower degree, yielding compact representations. Gaussian Vector Quantization, based on each Gaussian's global significance, further lowers bitwidth with minimal accuracy loss. LightGaussian achieves an average 15x compression rate while boosting FPS from 144 to 237 within the 3D-GS framework, enabling efficient complex scene representation on the Mip-NeRF 360 and Tanks and Temples datasets. The proposed Gaussian pruning approach is also adaptable to other 3D representations (e.g., Scaffold-GS), demonstrating strong generalization capabilities.
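The pruning and spherical-harmonic reduction steps described in the abstract can be illustrated with a minimal sketch. The abstract does not define how global significance is computed, so the scores below are hypothetical placeholder values, and the function names and the prune ratio are assumptions made only for illustration; this is not the authors' implementation. The only fixed fact used is that spherical harmonics up to degree d have (d + 1)^2 coefficients per color channel.

```python
def sh_coeff_count(degree: int, channels: int = 3) -> int:
    """Number of SH coefficients for bands 0..degree across all color channels."""
    return (degree + 1) ** 2 * channels


def prune_by_significance(scores, prune_ratio):
    """Return the sorted indices of Gaussians kept after dropping the
    lowest-scoring fraction. `scores` is a hypothetical per-Gaussian
    significance value; how it is computed is not specified here."""
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    n_drop = int(len(scores) * prune_ratio)
    # Rank Gaussians from least to most significant, then drop the first n_drop.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[n_drop:])


# Hypothetical significance scores for five Gaussians (illustrative only).
scores = [0.9, 0.05, 0.4, 0.01, 0.7]
kept = prune_by_significance(scores, prune_ratio=0.4)
print(kept)

# Per-Gaussian color parameters at two example SH degrees.
print(sh_coeff_count(3), sh_coeff_count(2))
```

In a full pipeline, the pruned model would then be fine-tuned (the "recovery" step in the abstract) before the spherical harmonic coefficients are distilled to a lower degree and the remaining attributes are quantized.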
- Zhiwen Fan
- Kevin Wang
- Kairun Wen
- Zehao Zhu
- Dejia Xu
- Zhangyang Wang