OccGaussian: 3D Gaussian Splatting for Occluded Human Rendering (2404.08449v3)

Published 12 Apr 2024 in cs.CV

Abstract: Rendering dynamic 3D humans from monocular videos is crucial for various applications such as virtual reality and digital entertainment. Most methods assume the person is in an unobstructed scene, while in real-life scenarios various objects may occlude body parts. A previous method uses NeRF for surface rendering to recover the occluded areas, but it requires more than one day to train and several seconds to render, failing to meet the requirements of real-time interactive applications. To address these issues, we propose OccGaussian, based on 3D Gaussian Splatting, which can be trained within 6 minutes and produces high-quality human renderings at up to 160 FPS with occluded input. OccGaussian initializes 3D Gaussian distributions in the canonical space and performs an occlusion feature query at occluded regions; the aggregated pixel-align feature is extracted to compensate for the missing information. We then use a Gaussian Feature MLP to further process the feature, along with occlusion-aware loss functions, to better perceive the occluded area. Extensive experiments on both simulated and real-world occlusions demonstrate that our method achieves comparable or even superior performance compared to the state-of-the-art method, and we improve training and inference speeds by 250x and 800x, respectively. Our code will be available for research purposes.

Summary

The paper introduces a novel method that leverages 3D Gaussian splatting to overcome occlusion challenges in human rendering.
It employs 3D Gaussian forward skinning and occlusion feature queries to efficiently capture and enhance missing details in occluded regions.
Experiments demonstrate that OccGaussian renders at up to 160 FPS and trains 250x faster than the state-of-the-art method, with comparable or superior rendering quality.

3D Gaussian Splatting for Occluded Human Rendering: A Study on OccGaussian

Introduction

Rendering dynamic 3D humans from monocular videos is crucial for virtual reality and digital entertainment. However, occlusion poses a significant challenge: most methods assume an unobstructed scene and struggle to maintain high-quality renderings when parts of the human body are hidden by other objects. The OccGaussian method addresses this limitation by building on 3D Gaussian Splatting, which gives it rapid training and real-time rendering while still producing high-quality human figures in occluded scenarios.

Technical Summary

OccGaussian initializes 3D Gaussian distributions in the canonical space and performs occlusion feature queries at occluded regions, extracting aggregated pixel-align features to compensate for the missing information. A Gaussian Feature MLP then processes these features, and occlusion-aware loss functions help the model better perceive the occluded area. According to the abstract, OccGaussian trains within 6 minutes and renders at up to 160 FPS, which corresponds to 250x faster training and 800x faster inference than the state-of-the-art method. This efficiency does not compromise quality: the method reports comparable or superior performance to that baseline.
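As a rough consistency check, the speedup factors can be turned back into the baseline costs they imply. The sketch below uses only figures quoted in the abstract (6 minutes of training, up to 160 FPS, 250x and 800x) and assumes the speedups were measured against exactly those numbers, which the source does not confirm. The variable names are illustrative.

```python
# Figures quoted in the abstract.
train_minutes = 6.0     # "trained within 6 minutes"
render_fps = 160.0      # "up to 160 FPS"
train_speedup = 250.0   # "250x" faster training
render_speedup = 800.0  # "800x" faster inference

# Implied baseline costs, ASSUMING the speedups are relative to the
# exact figures above (an assumption, not stated in the source).
baseline_train_hours = train_minutes * train_speedup / 60.0
frame_ms = 1000.0 / render_fps
baseline_frame_s = frame_ms * render_speedup / 1000.0

print(f"implied baseline training time: {baseline_train_hours:.1f} h")
print(f"OccGaussian time per frame:     {frame_ms:.2f} ms")
print(f"implied baseline time per frame: {baseline_frame_s:.1f} s")
```

Under that assumption the implied baseline is about 25 hours of training and about 5 seconds per frame, which agrees with the abstract's description of the earlier NeRF-based method as needing "more than one day to train and several seconds to render".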

Methodological Innovations

3D Gaussian Forward Skinning: Adapts the 3D Gaussian Splatting technique for occluded human rendering, leveraging the efficiency of 3DGS while ensuring high-quality renderings of dynamic human figures under occlusion.
Occlusion Feature Query: Implements K-nearest feature query in occluded regions, followed by the extraction of aggregated pixel-align features to effectively utilize local information and compensate for the absence of ground truth in these areas.
Gaussian Feature MLP: Further processes the features of occluded regions, predicting spherical harmonic coefficients and opacity values through MLP, enhancing the rendering quality in occluded areas.
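The occlusion feature query can be illustrated with a small K-nearest lookup. This is a minimal sketch, not the paper's implementation: the function name `knn_feature_query`, the inverse-distance weighting, and the toy feature vectors are all assumptions made for illustration, since the summary does not specify how the neighbouring features are aggregated.

```python
import math

def knn_feature_query(query, points, features, k=3, eps=1e-8):
    """Aggregate the features of the k points nearest to `query`.

    Hypothetical helper: uses inverse-distance weights, which is an
    assumed aggregation rule and not taken from the paper.
    """
    # Euclidean distance from the query position to every candidate point.
    dists = [math.dist(query, p) for p in points]
    # Indices of the k closest candidates (ties keep input order).
    nearest = sorted(range(len(points)), key=lambda i: dists[i])[:k]
    # Inverse-distance weights; eps avoids division by zero.
    weights = [1.0 / (dists[i] + eps) for i in nearest]
    total = sum(weights)
    dim = len(features[0])
    # Weighted average of the neighbours' feature vectors.
    return [
        sum(w * features[i][c] for w, i in zip(weights, nearest)) / total
        for c in range(dim)
    ]

# Toy data: three visible points with 2-D features, one occluded query point.
visible_points = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
visible_features = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
occluded_point = (1.0, 0.0, 0.0)

aggregated = knn_feature_query(occluded_point, visible_points, visible_features, k=2)
print(aggregated)
```

Here the occluded point sits midway between the first two visible points, so with `k=2` it receives an equal blend of their features and ignores the distant third point. In the method as summarized, the resulting feature would then be passed to the Gaussian Feature MLP to predict spherical harmonic coefficients and opacity.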

Experimental Insights

OccGaussian is evaluated on the ZJU-MoCap and OcMotion datasets, with experiments covering both simulated and real-world occlusions. It achieves rendering quality comparable or superior to the state-of-the-art method while training and rendering far faster, which makes it particularly suitable for real-time applications.

Practical Implications and Future Prospects

OccGaussian represents a significant advancement in the field of 3D human rendering, particularly for scenarios complicated by occlusions. The method's efficiency and quality make it an appealing option for a wide range of applications, from virtual try-on and augmented reality to virtual production in films.

Future research may explore incorporating temporal information to enhance the reconstruction of severely occluded regions, a limitation currently faced by OccGaussian. Additionally, improving the method's robustness to inaccuracies in pose and camera parameters could extend its applicability to in-the-wild videos. The remarkable improvements in efficiency and rendering quality position OccGaussian as a promising avenue for future developments in the field of 3D human rendering.

Conclusion

OccGaussian introduces a novel approach to rendering occluded humans in monocular videos by leveraging 3D Gaussian Splatting. Its efficiency in training and rendering, combined with its ability to produce high-quality renderings in the presence of occlusions, marks a notable advancement in the field. As the method opens new doors for real-time applications and beyond, OccGaussian is poised to drive further innovations in 3D human rendering technology.
