Self-Ensembling Gaussian Splatting for Few-Shot Novel View Synthesis (2411.00144v3)

Published 31 Oct 2024 in cs.CV and cs.GR

Abstract: 3D Gaussian Splatting (3DGS) has demonstrated remarkable effectiveness in novel view synthesis (NVS). However, 3DGS tends to overfit when trained with sparse views, limiting its generalization to novel viewpoints. In this paper, we address this overfitting issue by introducing Self-Ensembling Gaussian Splatting (SE-GS). We achieve self-ensembling by incorporating an uncertainty-aware perturbation strategy during training. A $\mathbf{\Delta}$-model and a $\mathbf{\Sigma}$-model are jointly trained on the available images. The $\mathbf{\Delta}$-model is dynamically perturbed based on rendering uncertainty across training steps, generating diverse perturbed models with negligible computational overhead. Discrepancies between the $\mathbf{\Sigma}$-model and these perturbed models are minimized throughout training, forming a robust ensemble of 3DGS models. This ensemble, represented by the $\mathbf{\Sigma}$-model, is then used to generate novel-view images during inference. Experimental results on the LLFF, Mip-NeRF360, DTU, and MVImgNet datasets demonstrate that our approach enhances NVS quality under few-shot training conditions, outperforming existing state-of-the-art methods. The code is released at: https://sailor-z.github.io/projects/SEGS.html.

References (53)
  1. Pitfalls of in-domain uncertainty estimation and ensembling in deep learning. arXiv preprint arXiv:2002.06470, 2020.
  2. Mip-NeRF: A multiscale representation for anti-aliasing neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5855–5864, 2021.
  3. Mip-NeRF 360: Unbounded anti-aliased neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5470–5479, 2022.
  4. Spherical averages and applications to spherical splines and interpolation. ACM Transactions on Graphics, 20(2):95–126, 2001.
  5. PixelSplat: 3D Gaussian splats from image pairs for scalable generalizable 3D reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19457–19467, 2024.
  6. A survey on 3D Gaussian splatting. arXiv preprint arXiv:2401.03890, 2024.
  7. MVSplat: Efficient 3D Gaussian splatting from sparse multi-view images. arXiv preprint arXiv:2403.14627, 2024.
  8. Novel view synthesis with multiple 360 images for large-scale 6-DoF virtual reality system. In IEEE Conference on Virtual Reality and 3D User Interfaces, pages 880–881. IEEE, 2019.
  9. Volume rendering. ACM SIGGRAPH Computer Graphics, 22(4):65–74, 1988.
  10. InstantSplat: Unbounded sparse-view pose-free Gaussian splatting in 40 seconds. arXiv preprint arXiv:2403.20309, 2024.
  11. Self-ensembling for visual domain adaptation. arXiv preprint arXiv:1706.05208, 2017.
  12. Ensemble deep learning: A review. Engineering Applications of Artificial Intelligence, 115:105151, 2022.
  13. Loss surfaces, mode connectivity, and fast ensembling of DNNs. Advances in Neural Information Processing Systems, 31, 2018.
  14. Cascade cost volume for high-resolution multi-view stereo and stereo matching. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2495–2504, 2020.
  15. SuGaR: Surface-aligned Gaussian splatting for efficient 3D mesh reconstruction and high-quality mesh rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5354–5363, 2024.
  16. Binocular-guided 3D Gaussian splatting with view consistency for sparse view synthesis. arXiv preprint arXiv:2410.18822, 2024.
  17. EfficientNeRF: Efficient neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12902–12911, 2022.
  18. Putting NeRF on a diet: Semantically consistent few-shot view synthesis. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5885–5894, 2021.
  19. Large scale multi-view stereopsis evaluation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 406–413, 2014.
  20. 3D Gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics, 42(4):139:1–139:14, 2023.
  21. Temporal ensembling for semi-supervised learning. arXiv preprint arXiv:1610.02242, 2016.
  22. Why M heads are better than one: Training a diverse ensemble of deep networks. arXiv preprint arXiv:1511.06314, 2015.
  23. DNGaussian: Optimizing sparse-view 3D Gaussian radiance fields with global-local depth normalization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20775–20785, 2024.
  24. MVSGaussian: Fast generalizable Gaussian splatting reconstruction from multi-view stereo. In European Conference on Computer Vision, pages 37–53. Springer, 2024.
  25. Local light field fusion: Practical view synthesis with prescriptive sampling guidelines. ACM Transactions on Graphics, 38(4):1–14, 2019.
  26. NeRF: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1):99–106, 2021.
  27. SELF: Learning to filter noisy labels with self-ensembling. arXiv preprint arXiv:1910.01842, 2019.
  28. RegNeRF: Regularizing neural radiance fields for view synthesis from sparse inputs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5480–5490, 2022.
  29. CoherentGS: Sparse novel view synthesis with coherent 3D Gaussians. arXiv preprint arXiv:2403.19495, 2024.
  30. D-NeRF: Neural radiance fields for dynamic scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10318–10327, 2021.
  31. Uncertainty quantification and deep ensembles. Advances in Neural Information Processing Systems, 34:20063–20075, 2021.
  32. Vision transformers for dense prediction. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 12179–12188, 2021.
  33. Structure-from-motion revisited. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
  34. Pixelwise view selection for unstructured multi-view stereo. In European Conference on Computer Vision, 2016.
  35. A comparison and evaluation of multi-view stereo reconstruction algorithms. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 519–528. IEEE, 2006.
  36. A survey on image data augmentation for deep learning. Journal of Big Data, 6(1):1–48, 2019.
  37. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1):1929–1958, 2014.
  38. SparseNeRF: Distilling depth ranking for few-shot novel view synthesis. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9065–9076, 2023.
  39. NeuS: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. arXiv preprint arXiv:2106.10689, 2021.
  40. FreeSplat: Generalizable 3D Gaussian splatting towards free-view synthesis of indoor scenes. arXiv preprint arXiv:2405.17958, 2024.
  41. NeRF--: Neural radiance fields without known camera parameters. arXiv preprint arXiv:2102.07064, 2021.
  42. Surface reconstruction from Gaussian splatting via novel stereo views. arXiv preprint arXiv:2404.01810, 2024.
  43. MVPGS: Excavating multi-view priors for Gaussian splatting from sparse input views. arXiv preprint arXiv:2409.14316, 2024.
  44. FreeNeRF: Improving few-shot neural rendering with free frequency regularization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8254–8263, 2023.
  45. Uncertainty-aware self-ensembling model for semi-supervised 3D left atrium segmentation. In Medical Image Computing and Computer Assisted Intervention, pages 605–613. Springer, 2019.
  46. MVImgNet: A large-scale dataset of multi-view images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9150–9161, 2023.
  47. Mip-Splatting: Alias-free 3D Gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19447–19456, 2024.
  48. CoR-GS: Sparse-view 3D Gaussian splatting via co-regularization. In European Conference on Computer Vision, pages 335–352. Springer, 2024.
  49. View synthesis by appearance flow. In European Conference on Computer Vision, pages 286–301. Springer, 2016.
  50. Stereo magnification: Learning view synthesis using multiplane images. arXiv preprint arXiv:1805.09817, 2018.
  51. On the continuity of rotation representations in neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5745–5753, 2019.
  52. Ensembling neural networks: Many could be better than all. Artificial Intelligence, 137(1–2):239–263, 2002.
  53. FSGS: Real-time few-shot view synthesis using Gaussian splatting. In European Conference on Computer Vision, pages 145–163. Springer, 2024.

Summary

  • The paper presents a novel self-ensembling mechanism that regularizes 3D Gaussian Splatting models, reducing overfitting in few-shot novel view synthesis.
  • It employs a dual-model system with an uncertainty-aware perturbation strategy that generates diverse temporal samples for more robust training.
  • Experimental results demonstrate significant improvements in PSNR and perceptual metrics on datasets like LLFF, DTU, and Mip-NeRF360 compared to existing methods.

Self-Ensembling Gaussian Splatting for Few-shot Novel View Synthesis: An In-depth Analysis

The paper "Self-Ensembling Gaussian Splatting for Few-shot Novel View Synthesis" presents a novel approach aimed at enhancing the performance of 3D Gaussian Splatting (3DGS) in the context of sparse-view novel view synthesis (NVS). The core contribution lies in addressing the overfitting issues that arise when 3DGS models are trained with limited training views. The proposed technique introduces self-ensembling Gaussian Splatting (SE-GS), a method that employs regularization to derive robust and generalizable models through a self-ensembling mechanism, fundamentally improving the model's capacity to deliver high-quality NVS from few-shot images.

Key Contributions

The authors address a significant limitation of current state-of-the-art NVS methods when handling sparse training views. SE-GS integrates a dual-model system consisting of a $\Sigma$-model and a $\Delta$-model. The $\Delta$-model is subjected to an uncertainty-aware perturbation strategy that dynamically generates temporal samples in the Gaussian parameter space. This sidesteps the computational cost of training multiple models from scratch, as required by previous approaches such as CoR-GS. A schematic of one training step is sketched below.
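To make the training dynamics concrete, here is a minimal sketch of one SE-GS training step in PyTorch. The API is hypothetical: `render(model, view)` is assumed to return an (H, W, 3) image with gradients, `train_view.image` is assumed to hold the ground-truth photo, and the L1 losses and weighting are illustrative choices rather than the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def se_gs_step(sigma_model, delta_model, render, train_view, pseudo_view,
               estimate_uncertainty, perturb, lambda_reg=1.0):
    """One schematic SE-GS training step (hypothetical API).

    `estimate_uncertainty` and `perturb` follow the later sketches.
    """
    # 1. Fit both models to the real training image (the standard 3DGS
    #    photometric loss, simplified here to L1).
    loss_fit = (F.l1_loss(render(sigma_model, train_view), train_view.image)
                + F.l1_loss(render(delta_model, train_view), train_view.image))

    # 2. Perturb the Delta-model where its renderings are uncertain,
    #    producing a fresh "temporal sample" of the Gaussian parameter
    #    space at negligible cost.
    with torch.no_grad():
        perturb(delta_model, estimate_uncertainty(delta_model, pseudo_view))

    # 3. Pull the Sigma-model toward the perturbed sample on a pseudo view;
    #    detaching the Delta rendering makes this a one-way consistency term.
    loss_reg = F.l1_loss(render(sigma_model, pseudo_view),
                         render(delta_model, pseudo_view).detach())

    return loss_fit + lambda_reg * loss_reg
```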

Methodological Insights

The $\Delta$-model represents a temporal sample drawn from the Gaussian parameter space at each training step. A distinctive uncertainty-aware perturbation mechanism leverages pseudo views to estimate the reliability of renderings without substantial extra cost: dynamically updated image buffers allow per-pixel uncertainties to be computed, and these uncertainties guide the perturbation. As a result, the $\Delta$-model yields diverse temporal samples by perturbing regions of high uncertainty, providing a broader exploration of the parameter space.
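One plausible reading of this buffer-based estimate, written as a sketch: renderings of each pseudo view are accumulated over training steps, and the per-pixel standard deviation across the buffer serves as the uncertainty map. How pixel uncertainty is mapped back onto individual Gaussians is an assumption of this sketch (a per-Gaussian score is simply passed in).

```python
import torch

def pixelwise_uncertainty(render_buffer: torch.Tensor) -> torch.Tensor:
    """Uncertainty of one pseudo view from a rolling rendering buffer.

    render_buffer: (K, H, W, 3) tensor holding the K most recent renderings
    of the same pseudo view across training steps. Regions whose appearance
    keeps changing between steps receive high uncertainty.
    """
    return render_buffer.std(dim=0).mean(dim=-1)  # (H, W)

def perturb_gaussians(means: torch.Tensor,
                      gaussian_uncertainty: torch.Tensor,
                      noise_scale: float = 1e-3) -> torch.Tensor:
    """Perturb Gaussian centers proportionally to their uncertainty.

    means: (N, 3) Gaussian centers. gaussian_uncertainty: (N,) scores,
    e.g. obtained by projecting the pixel uncertainty map back onto the
    Gaussians (that mapping is omitted here). Uncertain Gaussians move
    more, so successive perturbations explore the parameter space where
    the current fit is least reliable.
    """
    noise = torch.randn_like(means) * noise_scale
    return means + gaussian_uncertainty.unsqueeze(-1) * noise
```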

Conversely, the $\Sigma$-model serves as the final ensemble model. It is trained with a regularization term applied across the temporal samples drawn from the $\Delta$-model: discrepancies between pseudo-view renderings of the two models are minimized throughout training, so the $\Sigma$-model aggregates information from the full family of samples and becomes more robust to overfitting.
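The regularization can be pictured as a photometric consistency term over pseudo views, for which no ground-truth photos exist, so each model's rendering is the other's only reference. The L1 form and the stop-gradient on the $\Delta$-model side are assumptions of this sketch, not confirmed details of the paper.

```python
import torch
import torch.nn.functional as F

def ensemble_regularization(render, sigma_model, delta_model, pseudo_views):
    """Average the Sigma/Delta consistency term over a batch of pseudo views.

    Because the Delta-model is re-perturbed across training steps,
    successive calls compare the Sigma-model against different temporal
    samples; averaging these comparisons over time is what makes the
    Sigma-model behave like an ensemble of 3DGS models.
    """
    losses = [F.l1_loss(render(sigma_model, pv),
                        render(delta_model, pv).detach())  # one-way update
              for pv in pseudo_views]
    return torch.stack(losses).mean()
```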

Experimental Validation

The efficacy of SE-GS is validated on the LLFF, DTU, Mip-NeRF360, and MVImgNet multi-view datasets, where it consistently outperforms leading approaches, including both NeRF-based and 3DGS-based methods. Particularly noteworthy is its superior performance with very few training views, demonstrating its stability in scenarios where collecting an extensive dataset is impractical.

SE-GS demonstrates significant improvements in PSNR and other perceptual metrics over alternatives such as FSGS and DNGaussian, which rely on auxiliary data, and CoR-GS, which employs cross-model regularization. The results underline SE-GS's ability to produce visually superior novel views, with sharper and more accurate reproduction of textures and fine details.
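For reference, PSNR, the primary fidelity metric in these comparisons, is computed from the mean squared error between a rendered image and the ground-truth image; a short PyTorch version:

```python
import torch

def psnr(rendered: torch.Tensor, target: torch.Tensor,
         max_val: float = 1.0) -> torch.Tensor:
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    mse = torch.mean((rendered - target) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)
```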

Theoretical and Practical Implications

Practically, SE-GS broadens the potential scope of applications in virtual and augmented reality, where high-quality 3D reconstruction from minimal data is a common constraint. Theoretically, it opens new directions for efficient and effective ensembling techniques within 3DGS frameworks, demonstrating that robustness and generalization can be achieved without extensive retraining overhead.

Future Directions

Future research could explore extending the uncertainty-aware mechanism to other neural representation learning scenarios or further optimizing buffer-based strategies for even faster inference speeds. Additionally, investigating the scalability of SE-GS with comprehensive scene hierarchies or increased resolution remains an intriguing line of inquiry.

In conclusion, this work provides a substantial contribution to the landscape of sparse-view novel view synthesis, delivering an adept methodological breakthrough that allows models to combat overfitting effectively. As such, SE-GS represents a significant step forward in the development of robust, high-fidelity NVS under data-constrained conditions.
