DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization (2403.06912v3)

Published 11 Mar 2024 in cs.CV

Abstract: Radiance fields have demonstrated impressive performance in synthesizing novel views from sparse input views, yet prevailing methods suffer from high training costs and slow inference speed. This paper introduces DNGaussian, a depth-regularized framework based on 3D Gaussian radiance fields, offering real-time and high-quality few-shot novel view synthesis at low cost. Our motivation stems from the highly efficient representation and surprising quality of the recent 3D Gaussian Splatting, although it suffers from geometry degradation when the number of input views decreases. In Gaussian radiance fields, we find that this degradation in scene geometry is primarily linked to the positioning of Gaussian primitives and can be mitigated by a depth constraint. Consequently, we propose a Hard and Soft Depth Regularization to restore accurate scene geometry under coarse monocular depth supervision while maintaining a fine-grained color appearance. To further refine detailed geometry reshaping, we introduce Global-Local Depth Normalization, enhancing the focus on small local depth changes. Extensive experiments on the LLFF, DTU, and Blender datasets demonstrate that DNGaussian outperforms state-of-the-art methods, achieving comparable or better results with significantly reduced memory cost, a $25 \times$ reduction in training time, and over $3000 \times$ faster rendering speed.

Summary

The paper introduces a dual depth regularization approach that accurately positions Gaussian primitives while preserving visual detail.
It integrates global-local depth normalization in loss functions to capture both fine-grained local depth variations and overall spatial coherence.
Experimental results show a 25× reduction in training time and rendering at over 300 FPS for sparse-view novel view synthesis.

An Expert Review of "DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization"

The paper "DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization" introduces an approach for improving the efficiency and quality of novel view synthesis from sparse input views. The framework, named DNGaussian, builds on 3D Gaussian radiance fields and adds depth regularization and depth normalization strategies. It aims to deliver real-time rendering speed without giving up image quality.

Key Contributions

In the context of 3D vision and neural rendering, the paper addresses the high training cost and slow inference of existing radiance-field methods, particularly neural radiance fields (NeRFs). Its main contributions are:

Hard and Soft Depth Regularization: The paper introduces a dual approach to depth regularization: a "hard depth" term that imposes firm spatial constraints on where Gaussian primitives sit, and a "soft depth" term that makes subtler opacity adjustments. Together they position the Gaussian primitives more precisely while preserving fine color detail.
Global-Local Depth Normalization: By applying both global and local depth normalization within the loss functions, DNGaussian becomes more sensitive to fine-grained local depth variations without losing overall spatial coherence. The technique combines patch-wise (local) normalization with image-wide (global) consistency.
Efficiency in Real-Time Rendering: The framework substantially reduces memory usage and training time, with a reported $25\times$ reduction in training time and rendering speeds exceeding 300 frames per second (FPS), which makes it a candidate for real-time applications.
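
The patch-wise normalization described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `patch` and `eps` parameters, and the choice of a mean squared error are all hypothetical. Each patch has its mean subtracted and is then divided either by its own standard deviation (local) or by the standard deviation of the whole depth map (global).

```python
from statistics import fmean, pstdev


def patch_normalized_depth(depth, patch, eps=1e-6, use_global_std=False):
    """Normalize a 2D depth map (a list of rows) patch by patch.

    Every value has its patch mean subtracted and is then divided by the
    patch's own standard deviation (local) or by the standard deviation of
    the whole map (global). `patch` and `eps` are illustrative parameters.
    """
    h, w = len(depth), len(depth[0])
    image_std = pstdev([v for row in depth for v in row])
    out = [[0.0] * w for _ in range(h)]
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            rows = range(top, min(top + patch, h))
            cols = range(left, min(left + patch, w))
            values = [depth[r][c] for r in rows for c in cols]
            mean = fmean(values)
            scale = image_std if use_global_std else pstdev(values)
            for r in rows:
                for c in cols:
                    out[r][c] = (depth[r][c] - mean) / (scale + eps)
    return out


def normalized_depth_loss(rendered, prior, patch, use_global_std=False):
    """Mean squared error between two patch-normalized depth maps."""
    a = patch_normalized_depth(rendered, patch, use_global_std=use_global_std)
    b = patch_normalized_depth(prior, patch, use_global_std=use_global_std)
    return fmean((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

Because both maps are normalized before comparison, a rendered depth that differs from the monocular prior only by a positive scale and a shift yields a loss close to zero. That property suits coarse monocular depth, which is typically reliable only up to scale and shift. Dividing by the local standard deviation magnifies errors in patches whose depth range is small, which is how this sketch models the added focus on small local depth changes.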

Experimental Validation

The authors support DNGaussian's performance with experiments on the established LLFF, DTU, and Blender datasets. The results show that DNGaussian matches or surpasses state-of-the-art methods at significantly lower computational cost, as measured by metrics such as PSNR, LPIPS, and SSIM. The review highlights the method's ability to preserve fine detail when few input views are available.
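
Of the metrics named above, PSNR is the simplest to compute by hand. The sketch below uses the standard definition, 10 · log10(MAX² / MSE), on images flattened to lists of pixel values. The function name and the `max_value` parameter are illustrative choices, and pixel values are assumed to lie in [0, 1].

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two flattened images."""
    if len(reference) != len(rendered):
        raise ValueError("images must contain the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR means the rendered view is closer to the ground truth. For example, a uniform per-pixel error of 0.1 on a [0, 1] scale gives an MSE of 0.01 and therefore a PSNR of 20 dB.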

Implications and Future Directions

DNGaussian is a meaningful step toward scalable and practical neural rendering. Efficient optimization of 3D Gaussian radiance fields could benefit areas such as augmented reality, virtual reality, and real-time graphics, where available compute and memory vary widely.

Looking ahead, avenues for further research could involve investigating the integration of additional geometric priors to address sparse-view density variations, enhancing robustness towards reflective or specular surfaces, or refining the framework's scalability to more diverse and larger scenes. The potential cross-compatibility with emerging rendering paradigms could facilitate broader adoption across AI-driven visualization fields.

In summary, the paper presents a well-rounded, technically sound method for sparse-view novel view synthesis that addresses the high training cost and slow inference of prior neural rendering methods.
