- The paper proposes synthesizing stereo image pairs from Gaussian splatting to extract depth data for geometrically consistent surface reconstruction.
- It employs TSDF integration and semi-automatic segmentation to generate smooth and detailed 3D meshes.
- Empirical tests on the Tanks and Temples benchmark confirm its superior accuracy and reduced computational cost compared to existing methods.
Surface Reconstruction from Gaussian Splatting through Stereo View Synthesis
Overview
Researchers have proposed a novel method for surface reconstruction from Gaussian splatting models by exploiting the capacity for high-quality novel-view synthesis inherent in such models. Unlike existing strategies that attempt to align Gaussian elements spatially to build a surface directly, this approach synthesizes stereo image pairs from novel viewpoints and extracts depth information via stereo matching. This depth data, integrated across views, forms a geometrically consistent surface representation. This methodology not only improves reconstruction fidelity and detail over current methods but does so with significantly reduced computational demand.
Surface Reconstruction Challenge in Gaussian Splatting
Gaussian splatting models (3DGS) optimize a cloud of 3D Gaussians so that their rendered images match the input images across varying viewpoints. Despite their effectiveness for novel-view synthesis, leveraging the spatial layout of the Gaussian elements for direct surface reconstruction is substantially harder: the Gaussians are positioned to reproduce images, not to lie on surfaces, so reading geometry directly off them yields inaccurate and noisy reconstructions.
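For context, a pixel colour C in the standard 3DGS rendering model is an alpha-composite of the depth-sorted Gaussians overlapping that pixel:

C = Σ_i c_i · α_i · Π_{j<i} (1 − α_j)

where c_i and α_i are the colour and effective opacity of the i-th projected Gaussian. Nothing in this objective forces the Gaussian centres to coincide with the true surface, which is why geometry read straight off the splats tends to be unreliable.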
Our Novel Approach
The proposed methodology introduces a pipeline that leverages 3DGS's novel-view synthesis to render stereo-calibrated image pairs, from which depth maps are computed and fused into a refined, geometrically consistent surface mesh. The process involves the following steps (a code sketch follows the list):
- Capturing scenes using 3DGS to enable the generation of synthetic stereo image pairs.
- Applying stereo matching algorithms to these image pairs to derive depth information.
- Integrating collected depth data using Truncated Signed Distance Function (TSDF) techniques to produce a unified and smooth surface mesh.
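The stereo step can be sketched as follows. This is a minimal illustration, not the paper's implementation: `render(pose, K)` is a hypothetical handle to whatever 3DGS renderer is in use, the baseline value is a free choice, and OpenCV's semi-global matching stands in for the paper's stereo matcher. For a rectified pair, depth follows from depth = focal_length · baseline / disparity.

```python
import numpy as np
import cv2

def render_stereo_pair(render, cam_pose, K, baseline):
    """Render a rectified stereo pair from a 3DGS model.

    `render(pose, K)` is a placeholder for the Gaussian-splatting renderer
    (the actual API depends on the 3DGS codebase). `cam_pose` is a 4x4
    camera-to-world matrix; the right camera is the left one translated by
    `baseline` along its local x-axis, so the pair is rectified by construction.
    """
    left = render(cam_pose, K)
    right_pose = cam_pose.copy()
    # Camera x-axis in world coordinates is column 0 of the rotation block.
    right_pose[:3, 3] += right_pose[:3, 0] * baseline
    right = render(right_pose, K)
    return left, right

def depth_from_stereo(left, right, K, baseline, max_disp=192):
    """Estimate a depth map from a rectified pair via semi-global matching."""
    matcher = cv2.StereoSGBM_create(
        minDisparity=0,
        numDisparities=max_disp,   # must be a multiple of 16
        blockSize=5,
        P1=8 * 3 * 5 ** 2,
        P2=32 * 3 * 5 ** 2,
    )
    gray_l = cv2.cvtColor(left, cv2.COLOR_BGR2GRAY)
    gray_r = cv2.cvtColor(right, cv2.COLOR_BGR2GRAY)
    # OpenCV returns fixed-point disparity scaled by 16.
    disparity = matcher.compute(gray_l, gray_r).astype(np.float32) / 16.0
    depth = np.zeros_like(disparity)
    valid = disparity > 0
    # Rectified stereo: depth = focal_length * baseline / disparity.
    depth[valid] = K[0, 0] * baseline / disparity[valid]
    return depth
```

Because both views are rendered rather than captured, the baseline and viewpoint can be chosen freely per scene region, which is what makes dense, consistent depth extraction from a 3DGS model practical.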
Additionally, the paper describes a semi-automatic approach for reconstructing specific objects within a scene by combining the depth maps with segmentation masks, so that only the selected object contributes to the final mesh.
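A fusion step in this spirit is sketched below. Open3D's TSDF volume is used here only as a stand-in (the paper does not prescribe a particular library), and the voxel size, truncation distance, and depth cutoff are illustrative values. The optional per-view mask zeroes out depth outside the object of interest, which is how the targeted, object-level reconstruction described above can be realized.

```python
import numpy as np
import open3d as o3d

def fuse_depth_maps(views, K, width, height, voxel_length=0.01, mask_views=None):
    """Integrate per-view depth maps into a single mesh via TSDF fusion.

    `views` is a list of (color, depth, extrinsic) tuples, where `color` is a
    uint8 image, `depth` is a float32 map in metres, and `extrinsic` is a 4x4
    world-to-camera matrix. `mask_views` optionally holds boolean object masks;
    depth outside a mask is discarded so only the selected object is fused.
    """
    intrinsic = o3d.camera.PinholeCameraIntrinsic(
        width, height, K[0, 0], K[1, 1], K[0, 2], K[1, 2])
    volume = o3d.pipelines.integration.ScalableTSDFVolume(
        voxel_length=voxel_length,
        sdf_trunc=4 * voxel_length,
        color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8,
    )
    for i, (color, depth, extrinsic) in enumerate(views):
        if mask_views is not None:
            depth = np.where(mask_views[i], depth, 0.0).astype(np.float32)
        rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
            o3d.geometry.Image(color),
            o3d.geometry.Image(depth),
            depth_scale=1.0,     # depth already in metres
            depth_trunc=5.0,     # ignore depth beyond 5 m
            convert_rgb_to_intensity=False,
        )
        volume.integrate(rgbd, intrinsic, extrinsic)
    return volume.extract_triangle_mesh()
```

Averaging many truncated signed-distance observations in this way is what smooths out per-view stereo noise while keeping the fused surface geometrically consistent across viewpoints.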
Empirical Validation
The effectiveness of the proposed method is confirmed through extensive testing across a variety of scenes, including ones captured in uncontrolled, "in-the-wild" conditions with standard smartphone cameras. Results on the Tanks and Temples benchmark further show that it outperforms the current leading methods for surface reconstruction from Gaussian splatting models. Notably, it achieves detailed and accurate reconstructions comparable to the best neural surface reconstruction techniques at a fraction of the computational cost.
Limitations and Future Directions
While the method marks a significant advancement, it is not without limitations. The reconstruction's fidelity relies heavily on the initial 3DGS scene capture quality. Inaccuracies in the Gaussian splatting model, particularly in less well-defined regions, can propagate errors into the final surface reconstruction. Additionally, the stereo matching process might struggle with transparent surfaces, potentially leading to incomplete or inaccurate reconstructions in these areas.
Future work could explore improving the robustness of the initial 3DGS capture process and enhancing stereo matching algorithms to better handle challenging surface properties like transparency. The fusion of depth data could also benefit from advances in integrating diverse information sources, potentially leading to even more accurate and detailed surface reconstructions.
Conclusion
This research introduces a significant methodological shift in surface reconstruction from Gaussian splatting models, focusing on leveraging novel-view synthesis for depth extraction rather than direct geometric manipulation of Gaussian elements. This strategy not only surpasses existing methods in accuracy and detail but also significantly reduces computational requirements, representing a meaningful advancement in the field of three-dimensional scene reconstruction.