Distractor-free Generalizable 3D Gaussian Splatting

Published 26 Nov 2024 in cs.CV | (2411.17605v2)

Abstract: We present DGGS, a novel framework that addresses the previously unexplored challenge: $\textbf{Distractor-free Generalizable 3D Gaussian Splatting}$ (3DGS). It mitigates 3D inconsistency and training instability caused by distractor data in the cross-scenes generalizable train setting while enabling feedforward inference for 3DGS and distractor masks from references in the unseen scenes. To achieve these objectives, DGGS proposes a scene-agnostic reference-based mask prediction and refinement module during the training phase, effectively eliminating the impact of distractor on training stability. Moreover, we combat distractor-induced artifacts and holes at inference time through a novel two-stage inference framework for references scoring and re-selection, complemented by a distractor pruning mechanism that further removes residual distractor 3DGS-primitive influences. Extensive feedforward experiments on the real and our synthetic data show DGGS's reconstruction capability when dealing with novel distractor scenes. Moreover, our generalizable mask prediction even achieves an accuracy superior to existing scene-specific training methods. Homepage is https://github.com/bbbbby-99/DGGS.

Authors (4)

Citations (1)

Summary

The paper presents DGGS, integrating distractor management in both training and inference to enhance the stability of 3D Gaussian splatting.
It introduces modules such as Reference-based Mask Prediction, Mask Refinement, and Training Views Selection to improve mask precision and training robustness.
The proposed two-stage inference, which combines Reference Scoring with Distractor Pruning, achieves higher PSNR and SSIM and lower LPIPS on novel scenes.

An Analysis of Distractor-free Generalizable 3D Gaussian Splatting

3D Gaussian Splatting (3DGS) has become a prominent technique in 3D vision because of its efficient scene representation and fast rendering. This paper introduces Distractor-free Generalizable 3D Gaussian Splatting (DGGS), a framework that aims to make 3DGS robust on real-world captures containing distractors. The authors pursue two objectives: keeping generalizable (cross-scene, feedforward) 3DGS functional when training and reference images contain distractors, and extending distractor-free 3DGS, which has so far relied on scene-specific training, to unseen scenes. According to the authors, integrating distractor handling into a generalizable framework had not been explored before.

Core Contributions

The paper proposes a training paradigm and an inference framework, each designed to limit the damage that distractors cause in its phase: unstable training in the first case, artifacts and holes in rendered views in the second.

Training Innovations: The authors introduce a Reference-based Mask Prediction module and a Mask Refinement module. Both improve the precision of distractor masks during training by exploiting geometric consistency across views. In addition, a Training Views Selection strategy controls how reference-query pairs are sampled, which stabilizes training by reducing the discrepancies that randomly chosen pairs introduce.
Inference Enhancements: Inference proceeds in two stages. A Reference Scoring mechanism first scores a pool of candidate reference images and re-selects those least affected by distractors. A Distractor Pruning module then removes the remaining distractor-related Gaussian primitives after attribute decoding. Together these steps target the residual distractor artifacts and holes that would otherwise appear in rendered query views.
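The two inference stages can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the scoring rule (fraction of pixels flagged as distractor), the pixel-aligned view of Gaussian primitives, and every name below (`distractor_ratio`, `select_references`, `prune_primitives`, the `view_*` keys) are hypothetical and chosen only to show the score, re-select, prune flow.

```python
from typing import Dict, List, Tuple

Mask = List[List[int]]  # assumption: 1 marks a distractor pixel, 0 a static pixel


def distractor_ratio(mask: Mask) -> float:
    """Fraction of pixels flagged as distractor in one candidate reference."""
    total = sum(len(row) for row in mask)
    flagged = sum(sum(row) for row in mask)
    return flagged / total if total else 0.0


def select_references(masks: Dict[str, Mask], k: int) -> List[str]:
    """Re-selection: keep the k candidates with the lowest distractor ratio.

    Ties are broken by view name so the result is deterministic.
    """
    ranked = sorted(masks, key=lambda name: (distractor_ratio(masks[name]), name))
    return ranked[:k]


def prune_primitives(mask: Mask) -> List[Tuple[int, int]]:
    """Pruning: return (row, col) of primitives that survive.

    Assumes one primitive per reference pixel; primitives that originate
    from pixels flagged as distractor are dropped.
    """
    return [
        (r, c)
        for r, row in enumerate(mask)
        for c, flag in enumerate(row)
        if flag == 0
    ]


# Toy pool of three 2x2 candidate references with predicted masks.
pool = {
    "view_a": [[0, 0], [0, 1]],
    "view_b": [[1, 1], [1, 0]],
    "view_c": [[0, 0], [0, 0]],
}
selected = select_references(pool, k=2)  # cleanest two views
kept = prune_primitives(pool["view_a"])  # primitives left after pruning
```

In this toy pool, the heavily contaminated `view_b` is never used as a reference, and the single flagged pixel of `view_a` contributes no primitive to the final scene.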

Numerical Results and Claims

The paper supports its claims with feedforward experiments on distractor-heavy real and synthetic datasets. DGGS outperforms the existing methods it is compared against, whether those are used pre-trained or re-trained on similar data, which indicates better generalization. Specifically, DGGS achieves higher PSNR and SSIM scores and lower LPIPS values than these baselines on novel, unseen scenes from the tested datasets. The abstract also reports that the generalizable mask prediction is more accurate than existing scene-specific training methods.
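For readers unfamiliar with the metrics, PSNR is the simplest of the three: it is $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$, so a higher value means the rendered image is closer to the ground truth. The sketch below computes it on flattened pixel lists; the function name `psnr`, the `max_val` parameter, and the toy pixel values are illustrative and do not come from the paper or its evaluation code.

```python
import math
from typing import Sequence


def psnr(rendered: Sequence[float], target: Sequence[float], max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are given as flat sequences of intensities in [0, max_val].
    """
    if len(rendered) != len(target) or len(rendered) == 0:
        raise ValueError("images must be non-empty and have the same size")
    mse = sum((a - b) ** 2 for a, b in zip(rendered, target)) / len(rendered)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


target = [0.0, 0.5, 1.0, 0.5]
close_render = [0.1, 0.6, 0.9, 0.4]  # every pixel off by 0.1, about 20 dB
poor_render = [0.3, 0.8, 0.7, 0.2]   # every pixel off by 0.3, lower PSNR

score_close = psnr(close_render, target)
score_poor = psnr(poor_render, target)
```

SSIM and LPIPS need more machinery (local statistics and a learned feature network, respectively), so they are not sketched here.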

Theoretical and Practical Implications

Theoretically, the paper clarifies how geometric consistency can be exploited in 3DGS frameworks. By separating distractor-induced errors from disparity-induced errors, the approach improves mask accuracy without scene-specific configuration. Practically, DGGS offers a robust option for everyday 3D reconstruction, where distractor-induced degradation is common. This matters in particular for mobile 3D reconstruction, where environments are rarely controlled and often contain transient objects and noise.
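One way to picture the separation of the two error sources is the sketch below. It is a simplified illustration, not the paper's module: it assumes a raw mask obtained by thresholding the per-pixel rendering residual, and a second boolean map saying whether a pixel is explained by static geometry that is consistent across references. The threshold `tau`, the `consistent` map, and both function names are hypothetical.

```python
from typing import List

Grid = List[List[float]]
BoolGrid = List[List[bool]]


def residual_mask(rendered: Grid, observed: Grid, tau: float) -> BoolGrid:
    """Raw mask: flag pixels whose rendering residual exceeds tau."""
    return [
        [abs(r - o) > tau for r, o in zip(r_row, o_row)]
        for r_row, o_row in zip(rendered, observed)
    ]


def refine_mask(raw: BoolGrid, consistent: BoolGrid) -> BoolGrid:
    """Keep a distractor flag only where the pixel is NOT geometrically consistent.

    A high residual on a multi-view consistent pixel is attributed to
    disparity (parallax or occlusion) rather than to a distractor.
    """
    return [
        [flag and not ok for flag, ok in zip(f_row, c_row)]
        for f_row, c_row in zip(raw, consistent)
    ]


rendered = [[0.2, 0.8], [0.5, 0.1]]
observed = [[0.2, 0.1], [0.9, 0.1]]
consistent = [[True, False], [True, True]]  # assumed to come from re-projection checks

raw = residual_mask(rendered, observed, tau=0.3)
refined = refine_mask(raw, consistent)
```

In this toy case the raw mask flags two pixels, but only the one that fails the consistency check remains flagged after refinement; the other high residual is treated as a disparity effect.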

Future Developments and Challenges

The authors acknowledge limitations, including degraded performance under extensive mutual occlusion and increased inference time. These point to directions for future research, such as integrating inpainting models and improving computational efficiency. DGGS provides a starting point for further work on distractor-free generalizable 3DGS.

In conclusion, this paper presents a significant step forward in overcoming the limitations faced by generalizable 3DGS frameworks when exposed to real-world, distractor-rich datasets. By effectively integrating distractor management into both the training and inference phases, the proposed DGGS framework not only enhances model reliability and accuracy but also extends the practical applicability of 3D Gaussian Splatting in diverse environments.
