
LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS (2311.17245v6)

Published 28 Nov 2023 in cs.CV

Abstract: Recent advances in real-time neural rendering using point-based techniques have enabled broader adoption of 3D representations. However, foundational approaches like 3D Gaussian Splatting impose substantial storage overhead, as Structure-from-Motion (SfM) points can grow to millions, often requiring gigabyte-level disk space for a single unbounded scene. This growth presents scalability challenges and hinders splatting efficiency. To address this, we introduce LightGaussian, a method for transforming 3D Gaussians into a more compact format. Inspired by Network Pruning, LightGaussian identifies Gaussians with minimal global significance on scene reconstruction, and applies a pruning and recovery process to reduce redundancy while preserving visual quality. Knowledge distillation and pseudo-view augmentation then transfer spherical harmonic coefficients to a lower degree, yielding compact representations. Gaussian Vector Quantization, based on each Gaussian's global significance, further lowers bitwidth with minimal accuracy loss. LightGaussian achieves an average 15x compression rate while boosting FPS from 144 to 237 within the 3D-GS framework, enabling efficient complex scene representation on the Mip-NeRF 360 and Tank & Temple datasets. The proposed Gaussian pruning approach is also adaptable to other 3D representations (e.g., Scaffold-GS), demonstrating strong generalization capabilities.

Authors (6)
  1. Zhiwen Fan (52 papers)
  2. Kevin Wang (41 papers)
  3. Kairun Wen (4 papers)
  4. Zehao Zhu (9 papers)
  5. Dejia Xu (37 papers)
  6. Zhangyang Wang (375 papers)
Citations (93)

Summary

LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS

The paper introduces LightGaussian, a method for efficiently compressing the 3D Gaussian representations used in real-time neural rendering. It addresses the substantial storage and scalability costs of foundational techniques such as 3D Gaussian Splatting, achieving an average 15x reduction in storage while raising frames per second (FPS) from 144 to 237.

Key Contributions and Methodology

The primary innovation of LightGaussian is its application of network pruning concepts to 3D Gaussians. By identifying and eliminating Gaussians deemed insignificant for scene reconstruction, the method significantly reduces redundancy without compromising visual fidelity. This pruning uses a rigorous criterion based on global significance scores, which take into account Gaussian opacity and volume.
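The significance-ranked pruning step can be sketched as follows. This is a hypothetical, simplified stand-in: the score here is just opacity times a volume proxy derived from the per-axis scales, whereas the paper's actual global significance score also accumulates each Gaussian's contribution across training-view rays. The function name and the 60% prune ratio are illustrative assumptions, not the paper's API.

```python
import numpy as np

def prune_gaussians(opacity, scales, prune_ratio=0.6):
    """Rank Gaussians by a simplified global significance score
    (opacity times an ellipsoid-volume proxy) and keep only the
    most significant fraction; returns sorted indices to keep."""
    volume = np.prod(scales, axis=1)           # proxy for 3D extent
    significance = opacity * volume            # simplified score
    n_keep = int(len(opacity) * (1.0 - prune_ratio))
    keep = np.argsort(significance)[-n_keep:]  # top scorers survive
    return np.sort(keep)

# toy example: 10 Gaussians with random attributes
rng = np.random.default_rng(0)
opacity = rng.uniform(0.0, 1.0, size=10)
scales = rng.uniform(0.01, 0.1, size=(10, 3))
kept = prune_gaussians(opacity, scales, prune_ratio=0.6)
print(len(kept))  # 4 Gaussians survive
```

In the paper, pruning is followed by a short recovery (fine-tuning) phase so the surviving Gaussians compensate for the removed ones.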

Subsequently, the method employs a novel approach to compress the Spherical Harmonics (SH) representation. Through a distillation process augmented by pseudo-view synthesis, it efficiently transfers knowledge from high- to lower-degree SHs, maintaining the visual effects crucial for scene appearance.
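Because a Gaussian's view-dependent color is linear in its SH coefficients, the distillation objective can be illustrated with a least-squares fit over sampled view directions. This is a simplified stand-in: the basis matrices below are random placeholders for real SH basis evaluations, and the paper optimizes a photometric loss on rendered images (with pseudo-view augmentation) rather than a direct coefficient fit.

```python
import numpy as np

def distill_sh(teacher_coeffs, basis_full, basis_low):
    """Fit low-degree SH coefficients so colors at a set of
    (pseudo-)view directions match the high-degree teacher.
    Colors are linear in the coefficients, so this reduces to
    least squares in this simplified setting."""
    target = basis_full @ teacher_coeffs                      # (V, 3) teacher colors
    student, *_ = np.linalg.lstsq(basis_low, target, rcond=None)
    return student                                            # low-degree coefficients

# stand-in basis matrices over 100 pseudo-view directions; a real
# implementation would evaluate the SH basis functions here
rng = np.random.default_rng(0)
basis_full = rng.normal(size=(100, 16))  # degree 3: (3+1)^2 = 16 functions
basis_low = basis_full[:, :9]            # degree 2 subset: 9 functions
teacher = rng.normal(size=(16, 3))       # per-channel RGB coefficients
student = distill_sh(teacher, basis_full, basis_low)
print(student.shape)  # (9, 3)
```

Note that fitting beats naive truncation: the fitted student matches the teacher's colors at least as well as simply dropping the high-degree coefficients, while storing 9 instead of 16 coefficients per channel.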

Additionally, a hybrid VecTree Quantization scheme is introduced. This technique quantizes the remaining Gaussian attributes to lower bitwidths while preserving accuracy, further reducing storage.
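The core idea of vector quantization can be sketched with a toy k-means codebook: each Gaussian's attribute vector is replaced by a small integer index into a shared codebook. This is an illustrative assumption-laden sketch, not the paper's VecTree scheme, which additionally uses the significance scores to leave the most important Gaussians unquantized.

```python
import numpy as np

def vector_quantize(attrs, n_codes=16, iters=10, seed=0):
    """Toy k-means vector quantization: learn a small codebook and
    store one index per Gaussian instead of the full float vector."""
    rng = np.random.default_rng(seed)
    codebook = attrs[rng.choice(len(attrs), n_codes, replace=False)]
    for _ in range(iters):
        # assign each attribute vector to its nearest code
        d = np.linalg.norm(attrs[:, None] - codebook[None], axis=2)
        idx = d.argmin(axis=1)
        # move each code to the mean of its cluster (skip empty ones)
        for k in range(n_codes):
            if np.any(idx == k):
                codebook[k] = attrs[idx == k].mean(axis=0)
    return codebook, idx

rng = np.random.default_rng(1)
attrs = rng.normal(size=(200, 9))     # e.g. degree-2 SH per Gaussian
codebook, idx = vector_quantize(attrs, n_codes=16)
decoded = codebook[idx]               # reconstruction via codebook lookup
```

Storage drops from one 9-float vector per Gaussian to one small index per Gaussian plus the shared 16-entry codebook.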

Numerical Results

The numerical results underscore the efficacy of LightGaussian. In the experimental evaluations on the Mip-NeRF 360 and Tanks & Temples datasets, the method achieves substantial improvements in storage efficiency and rendering speed. Specifically, the model size shrinks from 727 MB to 42 MB, and rendering speed increases to over 200 FPS with only a minimal decrease in SSIM (0.013).
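As a quick sanity check on the reported figures, the per-configuration size reduction works out slightly above the paper's quoted 15x average:

```python
# Reported model sizes from the evaluation above (in MB)
size_full, size_compact = 727.0, 42.0
ratio = size_full / size_compact
print(round(ratio, 1))  # 17.3
```

The 15x figure is the average across scenes, so individual configurations like this one can exceed it.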

Implications and Future Directions

The implications of this research are both practical and theoretical. Practically, LightGaussian enables the efficient deployment of large-scale 3D scenes in applications like virtual and augmented reality, autonomous driving, and digital twins. Theoretically, it presents a compelling case for adopting pruning and knowledge distillation techniques in 3D neural representations.

Future work could explore extending these techniques to other forms of 3D representations or further optimizing the hybrid quantization strategies. Enhancements in compression technologies and learning-based compression algorithms could also be areas of fruitful investigation.

Conclusion

The development of LightGaussian offers a significant contribution to the field of neural rendering, providing a scalable and efficient solution to the challenges posed by traditional 3D Gaussian methods. Its balanced approach to compression without substantial loss of quality sets a new benchmark for future research endeavors in this area.
