
3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting (2312.09228v3)

Published 14 Dec 2023 in cs.CV

Abstract: We introduce an approach that creates animatable human avatars from monocular videos using 3D Gaussian Splatting (3DGS). Existing methods based on neural radiance fields (NeRFs) achieve high-quality novel-view/novel-pose image synthesis but often require days of training, and are extremely slow at inference time. Recently, the community has explored fast grid structures for efficient training of clothed avatars. Albeit being extremely fast at training, these methods can barely achieve an interactive rendering frame rate with around 15 FPS. In this paper, we use 3D Gaussian Splatting and learn a non-rigid deformation network to reconstruct animatable clothed human avatars that can be trained within 30 minutes and rendered at real-time frame rates (50+ FPS). Given the explicit nature of our representation, we further introduce as-isometric-as-possible regularizations on both the Gaussian mean vectors and the covariance matrices, enhancing the generalization of our model on highly articulated unseen poses. Experimental results show that our method achieves comparable and even better performance compared to state-of-the-art approaches on animatable avatar creation from a monocular input, while being 400x and 250x faster in training and inference, respectively.

Authors (5)
  1. Zhiyin Qian (2 papers)
  2. Shaofei Wang (30 papers)
  3. Marko Mihajlovic (14 papers)
  4. Andreas Geiger (136 papers)
  5. Siyu Tang (86 papers)
Citations (57)

Summary

  • The paper presents a novel approach using 3D Gaussian Splatting that cuts training times to 30 minutes and achieves over 50 FPS rendering.
  • The paper employs a non-rigid deformation network to robustly handle complex human poses and generate realistic, animatable avatars.
  • The paper applies isometric regularizations on Gaussian parameters to enhance generalization and fidelity across varied dynamic poses.

The paper "3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting" presents an innovative approach to creating animatable human avatars using monocular videos. This method employs 3D Gaussian Splatting (3DGS) to address the limitations of existing techniques, particularly those based on neural radiance fields (NeRFs).

Key Contributions and Methodology:

  1. Efficiency Improvements: The proposed method significantly reduces training and inference times compared to state-of-the-art NeRF-based methods. While traditional approaches may take days to train, this paper reports training times of only 30 minutes. Additionally, the method achieves real-time rendering speeds, exceeding 50 frames per second (FPS), a substantial improvement over the roughly 15 FPS achieved by prior fast-grid methods.
  2. Non-Rigid Deformation Network: A crucial component of the approach is a non-rigid deformation network, which facilitates the reconstruction of animatable, clothed human avatars. This network allows for better handling of complex movements and poses.
  3. As-Isometric-As-Possible Regularizations: The authors introduce regularizations on the Gaussian mean vectors and covariance matrices to maintain isometry, which enhances the model's generalization capabilities for highly articulated and unseen poses. This aspect is critical for ensuring the avatars' accuracy and realism in a variety of dynamic scenarios.
  4. Performance: Experimental results indicate that the method matches, and in some cases surpasses, the performance of existing state-of-the-art techniques for creating animatable avatars from monocular input, while training roughly 400x faster and running inference roughly 250x faster.
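The as-isometric-as-possible idea in point 3 can be illustrated with a minimal sketch: penalize changes in pairwise distances between each Gaussian mean and its nearest neighbors when the avatar deforms. This is a simplified NumPy illustration, not the paper's implementation — the function name, the k-nearest-neighbor construction, and the deformations shown are assumptions for demonstration, and the paper applies an analogous constraint to the covariance matrices as well.

```python
import numpy as np

def isometric_loss(canonical, deformed, k=5):
    """Sketch of an as-isometric-as-possible regularizer: penalize
    changes in distances between each point and its k nearest
    canonical-space neighbors after deformation."""
    n = canonical.shape[0]
    # All pairwise distances between canonical Gaussian means.
    diff = canonical[:, None, :] - canonical[None, :, :]
    d_can = np.linalg.norm(diff, axis=-1)
    # Indices of the k nearest neighbors of each point (excluding self).
    nn = np.argsort(d_can, axis=1)[:, 1:k + 1]
    rows = np.repeat(np.arange(n), k)
    cols = nn.ravel()
    # Distances between the same neighbor pairs after deformation.
    d_def = np.linalg.norm(deformed[rows] - deformed[cols], axis=-1)
    # Penalize any change in pairwise distance (isometry violation).
    return np.mean((d_can[rows, cols] - d_def) ** 2)

rng = np.random.default_rng(0)
means = rng.standard_normal((100, 3))  # toy canonical Gaussian means

# A rigid rotation preserves all distances, so the loss is ~0.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
print(isometric_loss(means, means @ R.T))  # ~0

# Non-uniform stretching changes distances, so the loss is positive.
print(isometric_loss(means, means * np.array([2.0, 1.0, 1.0])))  # > 0
```

Intuitively, this is why the regularization helps on unseen poses: articulated motion of the body surface is locally close to rigid, so deformations that preserve local distances among the Gaussians stay plausible even for poses outside the training distribution.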

Significance and Impact:

The development of 3DGS-Avatar offers a substantial leap forward in the domain of avatar animation from monocular video inputs. By leveraging 3D Gaussian Splatting, the authors effectively address the limitations of NeRFs concerning speed and efficiency, making the process more feasible for applications requiring interactive frame rates. This method holds promise for various applications, including gaming, virtual reality, and film, where real-time performance and high-quality visualization are crucial.
