
EAGLES: Efficient Accelerated 3D Gaussians with Lightweight EncodingS (2312.04564v3)

Published 7 Dec 2023 in cs.CV and cs.GR

Abstract: Recently, 3D Gaussian splatting (3D-GS) has gained popularity in novel-view scene synthesis. It addresses the challenges of lengthy training times and slow rendering speeds associated with Neural Radiance Fields (NeRFs). Through rapid, differentiable rasterization of 3D Gaussians, 3D-GS achieves real-time rendering and accelerated training. It does, however, demand substantial memory resources for both training and storage, as it requires millions of Gaussians in its point cloud representation for each scene. We present a technique utilizing quantized embeddings to significantly reduce per-point memory storage requirements and a coarse-to-fine training strategy for a faster and more stable optimization of the Gaussian point clouds. Our approach develops a pruning stage which results in scene representations with fewer Gaussians, leading to faster training times and rendering speeds for real-time rendering of high resolution scenes. We reduce storage memory by more than an order of magnitude while preserving the reconstruction quality. We validate the effectiveness of our approach on a variety of datasets and scenes, preserving the visual quality while consuming 10-20x less memory with faster training/inference speed. Project page and code are available at https://efficientgaussian.github.io

Authors (3)
  1. Sharath Girish (11 papers)
  2. Kamal Gupta (22 papers)
  3. Abhinav Shrivastava (122 papers)
Citations (40)

Summary

  • The paper introduces a novel framework that uses quantized embeddings to compress Gaussian attributes and achieve over tenfold reduction in memory while preserving scene quality.
  • The paper applies a progressive training schedule that starts at lower resolutions to accelerate optimization and reduce artifacts in full-scale reconstruction.
  • The paper implements controlled densification to strategically add Gaussian points, enabling real-time rendering with performance comparable to state-of-the-art methods.

Background on 3D Scene Representations

3D scene representation is a critical area in computer vision that facilitates the generation of new views of a scene, often from different angles or perspectives not originally captured. Traditionally, this task involves considerable computational resources and storage, making it challenging to implement in real-time applications or on systems with limited memory. Neural Radiance Fields (NeRFs) have set a high standard for quality in scene reconstruction but are known for their demanding resource requirements.

Innovations in Efficient 3D Gaussians

A novel approach known as Efficient Accelerated 3D Gaussians with Lightweight Encoding (EAGLES) aims to mitigate the memory and computation intensity of previous methods. EAGLES leverages quantized embeddings to efficiently reduce memory storage while maintaining reconstruction quality. This approach results in scene representations that are lighter and faster, allowing for real-time rendering of high-resolution scenes with significantly reduced memory footprints.
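The storage-side effect of quantized embeddings can be illustrated with a minimal NumPy sketch (all names here are illustrative, not the paper's implementation): per-point attributes are stored as low-bit integer codes and decoded back to floats when needed. EAGLES additionally learns the latent embeddings and a decoder end-to-end during training; this sketch shows only the uniform-quantization round trip.

```python
import numpy as np

def quantize(x, num_bits=8, x_min=-1.0, x_max=1.0):
    """Uniformly quantize float attributes into integer codes.

    In an EAGLES-style training loop the rounding happens in the
    forward pass while gradients flow through unchanged (a
    straight-through estimator); here we show only storage.
    """
    levels = 2 ** num_bits - 1
    step = (x_max - x_min) / levels
    return np.clip(np.round((x - x_min) / step), 0, levels).astype(np.uint8)

def dequantize(codes, num_bits=8, x_min=-1.0, x_max=1.0):
    """Map integer codes back to approximate float attributes."""
    levels = 2 ** num_bits - 1
    step = (x_max - x_min) / levels
    return x_min + codes.astype(np.float32) * step

# Storing uint8 codes instead of float32 is already a 4x reduction per
# attribute; combined with pruning and compact latents, this is the kind
# of mechanism behind the paper's order-of-magnitude savings.
attrs = np.random.uniform(-1, 1, size=(1000, 3)).astype(np.float32)
codes = quantize(attrs)
recon = dequantize(codes)
```

The maximum reconstruction error of this uniform scheme is half a quantization step, which is why low-bit codes can preserve visual quality for well-ranged attributes.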

Key Technical Contributions

To achieve a balance between efficiency and quality, EAGLES introduces several key techniques:

  • Attribute Quantization: By compressing color and rotation attributes of Gaussian points in a scene, EAGLES considerably lowers memory requirements without substantial quality loss. A novel aspect includes the quantization of opacity coefficients, which enhances the optimization process and reduces visual artifacts.
  • Progressive Training: In lieu of starting with full image resolution during training, EAGLES adopts a progressive schedule, beginning with lower resolutions and gradually increasing to the full scale. This strategy not only speeds up training but also reduces the introduction of artifacts during the optimization of Gaussian points.
  • Controlled Densification: Carefully managing the frequency at which Gaussian points are added during training (densification) reduces the overall number of Gaussians, and therefore storage, without significantly affecting reconstruction quality.
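The progressive training idea above can be sketched as a simple resolution schedule (a sketch only; the starting fraction and ramp length are assumptions, not the paper's exact values):

```python
def resolution_scale(iteration, total_iters, start_scale=0.3, ramp_frac=0.5):
    """Fraction of full image resolution to train at on this iteration.

    Training begins at start_scale of full resolution and ramps linearly
    to 1.0 over the first ramp_frac of all iterations, then stays at
    full resolution for the remainder of training.
    """
    ramp_iters = max(1, int(ramp_frac * total_iters))
    t = min(iteration / ramp_iters, 1.0)
    return start_scale + (1.0 - start_scale) * t

# Example: sample the schedule at a few points of a 30k-iteration run;
# training images would be downsampled by this factor before rasterization.
total = 30000
scales = [resolution_scale(i, total) for i in (0, 7500, 15000, 30000)]
```

Optimizing against coarse targets first lets the Gaussians settle into large-scale structure before fine detail is fit, which is what speeds up convergence and suppresses early artifacts.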

Evaluation and Implications

Extensive evaluation of EAGLES on various datasets demonstrates reconstruction quality comparable to state-of-the-art techniques such as NeRF variants and 3D-GS, while significantly outperforming them in training duration and rendering frame rates. This performance is achieved with a more than tenfold reduction in storage, making EAGLES well suited for real-time applications and memory-constrained systems.

Conclusion

EAGLES offers an innovative solution to the challenge of real-time, high-quality 3D scene representation in memory-constrained environments. Its mix of quantization, progressive training, and controlled densification makes it a promising tool for real-world use cases that demand both efficiency and visual fidelity.
