FreGS: 3D Gaussian Splatting with Progressive Frequency Regularization (2403.06908v2)

Published 11 Mar 2024 in cs.CV

Abstract: 3D Gaussian splatting has achieved very impressive performance in real-time novel view synthesis. However, it often suffers from over-reconstruction during Gaussian densification where high-variance image regions are covered by a few large Gaussians only, leading to blur and artifacts in the rendered images. We design a progressive frequency regularization (FreGS) technique to tackle the over-reconstruction issue within the frequency space. Specifically, FreGS performs coarse-to-fine Gaussian densification by exploiting low-to-high frequency components that can be easily extracted with low-pass and high-pass filters in the Fourier space. By minimizing the discrepancy between the frequency spectrum of the rendered image and the corresponding ground truth, it achieves high-quality Gaussian densification and alleviates the over-reconstruction of Gaussian splatting effectively. Experiments over multiple widely adopted benchmarks (e.g., Mip-NeRF360, Tanks-and-Temples and Deep Blending) show that FreGS achieves superior novel view synthesis and outperforms the state-of-the-art consistently.

Authors (5)
  1. Jiahui Zhang (65 papers)
  2. Fangneng Zhan (53 papers)
  3. Muyu Xu (5 papers)
  4. Shijian Lu (151 papers)
  5. Eric Xing (127 papers)
Citations (27)

Summary

  • The paper demonstrates how progressive frequency regularization refines Gaussian densification to address over-reconstruction in 3D Gaussian Splatting.
  • It utilizes a frequency annealing technique that progressively incorporates spectral components to improve geometric fidelity and image detail.
  • Evaluations on benchmarks reveal FreGS achieves superior rendering quality with reduced artifacts in novel view synthesis.

Enhancing 3D Gaussian Splatting with Progressive Frequency Regularization: Introduction to FreGS

Overview

The field of 3D computer vision and novel view synthesis (NVS) has advanced rapidly with the development of Neural Radiance Fields (NeRF) and its derivatives. Notably, 3D Gaussian Splatting (3D-GS) has emerged as a promising alternative, enabling real-time NVS with a strong balance between training efficiency and rendering quality. Despite these advantages, 3D-GS suffers from over-reconstruction, which produces blur and artifacts in rendered images. To address this, the paper introduces the FreGS framework, which applies progressive frequency regularization (PFR) in the frequency space to refine Gaussian densification and thereby improve NVS quality.

Technical Foundations and Contributions

Preliminaries: The Challenge of Over-Reconstruction in 3D-GS

3D-GS, while efficient and effective in many respects, often suffers from over-reconstruction: high-variance image regions are covered by only a few large Gaussians, which fail to represent scene detail accurately. The result is blurred, artifact-laden renderings. This issue has proven difficult to address through spatial-domain optimization alone.

Progressive Frequency Regularization: A Spectral Solution

To tackle over-reconstruction, FreGS introduces a PFR approach that operates in the frequency domain. Recognizing that low and high spectral components play complementary roles in image representation, capturing larger structures and fine details respectively, FreGS regularizes these components to improve Gaussian densification progressively.
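
To make the frequency-space view concrete, the sketch below separates an image into low- and high-frequency components using complementary circular masks in the centered Fourier spectrum. This is a minimal PyTorch illustration, not the authors' implementation; the function name and the fractional-radius cutoff are our own choices.

```python
import torch


def split_frequency_bands(img: torch.Tensor, radius: float):
    """Split an image into low- and high-frequency components using
    complementary circular masks in the centered Fourier spectrum.

    img:    (H, W) or (C, H, W) tensor
    radius: cutoff as a fraction of the maximum frequency radius (0..1)
    """
    h, w = img.shape[-2:]
    spectrum = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))

    # Distance of every frequency bin from the spectrum center.
    ys = torch.arange(h).view(-1, 1) - h / 2
    xs = torch.arange(w).view(1, -1) - w / 2
    dist = torch.sqrt(ys ** 2 + xs ** 2)

    low_mask = (dist <= radius * dist.max()).to(img.dtype)  # low-pass filter
    high_mask = 1.0 - low_mask                               # complementary high-pass

    def back(masked_spectrum):
        # Undo the shift and return to the spatial domain.
        return torch.fft.ifft2(
            torch.fft.ifftshift(masked_spectrum, dim=(-2, -1))
        ).real

    return back(spectrum * low_mask), back(spectrum * high_mask)
```

In FreGS's coarse-to-fine scheme, supervision begins with the low-frequency component and gradually admits the high-frequency one, which is what the annealing mechanism described next controls.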

  • Frequency Annealing Technique: At the core of FreGS is a frequency annealing mechanism that systematically incorporates spectral components from low to high frequencies. This technique enables a more nuanced and effective Gaussian densification process, facilitating the progressive refinement of scene representations.
  • Spectral Regularization Objectives: The framework minimizes discrepancies in both amplitude and phase between the spectra of the rendered and ground-truth images. This dual regularization captures both the geometric structure and the textural detail of the scene (see the sketch after this list).
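
The sketch below shows how such an annealed amplitude-and-phase objective could look in PyTorch. The function name, the linear annealing schedule, and the equal weighting of the two terms are illustrative assumptions rather than the paper's exact formulation.

```python
import torch


def progressive_frequency_loss(rendered, target, step, total_steps):
    """Sketch of an annealed spectral loss: compare amplitude and phase
    of the FFTs inside a low-frequency band that widens over training.

    rendered, target: (C, H, W) image tensors
    step, total_steps: current and final training iteration
    """
    fr = torch.fft.fftshift(torch.fft.fft2(rendered), dim=(-2, -1))
    ft = torch.fft.fftshift(torch.fft.fft2(target), dim=(-2, -1))

    # Annealing: the admitted band grows linearly from low to high frequency.
    h, w = rendered.shape[-2:]
    ys = torch.arange(h).view(-1, 1) - h / 2
    xs = torch.arange(w).view(1, -1) - w / 2
    dist = torch.sqrt(ys ** 2 + xs ** 2)
    mask = (dist <= (step / total_steps) * dist.max()).to(rendered.dtype)

    # Amplitude and (naive, wrap-around-unaware) phase discrepancies
    # inside the currently admitted band.
    amp_loss = (mask * (fr.abs() - ft.abs()).abs()).mean()
    phase_loss = (mask * (fr.angle() - ft.angle()).abs()).mean()
    return amp_loss + phase_loss
```

In training, a loss of this shape would be added to the usual photometric objective, with step advancing each iteration so that high frequencies are only penalized once coarse structure has converged. A faithful reimplementation would also handle phase wrap-around, for example by comparing unit phasors rather than raw angles.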

Superior Performance and Future Directions

Evaluations on multiple benchmarks, including indoor and outdoor scenes from Mip-NeRF360, Tanks-and-Temples and Deep Blending, show that FreGS renders higher-quality images with fewer artifacts and finer detail than prior methods. By addressing over-reconstruction directly, FreGS sets a new standard for 3D-GS and NVS.

Theoretical and Practical Implications

  • Advancing 3D Gaussian Representation: FreGS advances the state of Gaussian splatting by alleviating an intrinsic limitation (over-reconstruction) and enriches the NVS toolbox with a robust alternative to NeRF-based methods.
  • Spectral Insights into NVS: The success of FreGS underscores the significance of frequency domain analyses in NVS tasks, suggesting that future research might further exploit spectral properties to solve analogous challenges in 3D vision.
  • Applications Beyond NVS: The principles of PFR developed here could have implications beyond NVS, potentially informing the design of algorithms in related areas such as image reconstruction, denoising, and more.

Conclusion

FreGS marks a significant step forward for 3D computer vision, particularly for the efficient, high-quality synthesis of novel views. By shifting supervision to the frequency domain and employing PFR, FreGS overcomes the over-reconstruction inherent in 3D-GS, paving the way for more accurate and visually convincing synthetic imagery. As this approach is explored and refined, even more sophisticated and nuanced scene representations for real-time NVS appear within reach.