- The paper introduces a novel object-centric 2D Gaussian Splatting framework that isolates target objects using segmentation masks.
- The paper implements an occlusion-aware Gaussian pruning strategy that reduces model sizes by up to 96% and cuts training time by up to 71%.
- The paper demonstrates that targeted model building improves computational efficiency while preserving reconstruction quality on benchmark datasets such as DTU and Mip-NeRF360.
Object-Centric 2D Gaussian Splatting: Background Removal and Occlusion-Aware Pruning for Compact Object Models
The study presents a novel approach aimed at enhancing the efficiency and specificity of Gaussian Splatting techniques for building object-centric models. The proposed method addresses a computational inefficiency inherent in current Gaussian Splatting pipelines, which construct comprehensive scene models without isolating specific objects of interest. Through object-centric 2D Gaussian Splatting, the authors introduce a framework that substantially reduces model size and accelerates training.
Methodological Innovations
This paper introduces two primary innovations:
- Object Mask-Guided Background Removal: By leveraging segmentation masks, the method isolates the target object from the background. This object-centric masking strategy lets the model focus computational resources on reconstructing the desired object rather than the entire scene. It is achieved through a novel background loss that penalizes Gaussian contributions outside the mask, steering optimization toward the object of interest. These segmentation constraints reduce computational overhead and enable faster convergence by eliminating unnecessary data pertaining to non-target regions.
- Occlusion-Aware Gaussian Pruning Strategy: The paper proposes a pruning technique that removes occluded Gaussians, or those that do not contribute significantly to the rendered images. This strategy further condenses the model, yielding a representation that maintains quality but is significantly smaller in size. By employing a technique that monitors and prunes non-essential Gaussians, the resulting models are shown to be up to 96% smaller with a training time reduction of up to 71% compared to existing methods.
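The two mechanisms above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function names, the L1 form of the losses, the background weight, and the fixed contribution threshold are all assumptions made for clarity.

```python
import numpy as np

def masked_background_loss(rendered, target, mask, bg_weight=1.0):
    """Sketch of a mask-guided loss (hypothetical form).

    Inside the object mask, the rendered image is matched against the
    target; outside the mask, rendered intensity is pushed toward zero,
    so Gaussians that only cover the background receive no support.
    """
    mask = mask.astype(bool)
    obj_loss = np.abs(rendered[mask] - target[mask]).mean() if mask.any() else 0.0
    bg_loss = np.abs(rendered[~mask]).mean() if (~mask).any() else 0.0
    return obj_loss + bg_weight * bg_loss

def prune_by_contribution(accumulated_weights, threshold=1e-3):
    """Sketch of occlusion-aware pruning (hypothetical criterion).

    `accumulated_weights` holds, per Gaussian, its alpha-blending
    contribution summed over all training views; Gaussians that are
    occluded or otherwise invisible accumulate negligible weight and
    are dropped from the model.
    """
    return accumulated_weights > threshold
```

In this sketch, a Gaussian hidden behind the object's surface in every view would accumulate near-zero blending weight and be pruned, which is how the model shrinks without changing what the remaining Gaussians render.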
Quantitative Results
The efficacy of the presented methods is substantiated through comparisons on benchmark datasets, including DTU and Mip-NeRF360, showing competitive reconstruction quality with significantly reduced model sizes and training durations. The tabulated results indicate that the object-centric approach delivers these efficiency gains without a major sacrifice in the quality of the reconstructed mesh. The study shows effective mask generation using SAM 2, although limitations are acknowledged in scenarios with thin structures or complex opacity.
Implications and Future Directions
The implications of this work are substantial for applications in which specific object editing and simulations are critical. The reduced model sizes lend themselves to faster and more manageable workflows, rendering the output suitable for direct use in appearance editing and physics simulations, which traditionally require mesh-based representations. From a theoretical perspective, this approach could catalyze further work in targeted splatting techniques, where precision and resource allocation are optimized.
Future research could explore integration with other forms of radiance fields or advanced segmentation algorithms to improve robustness in mask generation, especially in challenging contexts like overlapping objects or semi-transparent textures. Additionally, expanding these methodologies to 3DGS frameworks could widen the applicability across various domains of computer graphics and vision, potentially bridging existing gaps in real-time rendering and scene manipulation workflows.
The findings and methodologies presented in this paper are significant contributions toward more efficient and specialized handling of scenes in computational rendering. The innovations mark a step toward focused, resource-conscious model building for augmented reality, virtual reality, and related fields.