
GGAvatar: Reconstructing Garment-Separated 3D Gaussian Splatting Avatars from Monocular Video (2411.09952v1)

Published 15 Nov 2024 in cs.CV, cs.AI, and cs.MM

Abstract: Avatar modelling has broad applications in human animation and virtual try-ons. Recent advancements in this field have focused on high-quality and comprehensive human reconstruction but often overlook the separation of clothing from the body. To bridge this gap, this paper introduces GGAvatar (Garment-separated 3D Gaussian Splatting Avatar), which relies on monocular videos. Through advanced parameterized templates and unique phased training, this model effectively achieves decoupled, editable, and realistic reconstruction of clothed humans. Comparative evaluations with other costly models confirm GGAvatar's superior quality and efficiency in modelling both clothed humans and separable garments. The paper also showcases applications in clothing editing, as illustrated in Figure 1, highlighting the model's benefits and the advantages of effective disentanglement. The code is available at https://github.com/J-X-Chen/GGAvatar/.

Summary

  • The paper introduces a model that disentangles clothing from body structures using 3D Gaussian Splatting, enabling precise 3D avatar reconstruction.
  • It employs garment template estimation and a joint training strategy to accurately separate and render garments from human body models.
  • The method achieves significantly faster training speeds and superior image quality compared to NeRF-based approaches, supporting applications like virtual try-ons and avatar editing.

GGAvatar: Reconstructing Garment-Separated 3D Gaussian Splatting Avatars from Monocular Video

This paper introduces GGAvatar, a model for constructing garment-separated 3D avatars from monocular video using 3D Gaussian Splatting (3DGS) techniques. The work addresses the challenge of disentangling clothing from human body models in virtual avatars, a task often overlooked in prior formulations that focus extensively on human shape reconstruction but neglect the distinct modeling of clothing.

Overview and Methodology

The GGAvatar model uses advanced parameterized templates and phased training to separate and model garments from 3D human avatars efficiently. The pipeline involves several components: garment template estimation, Gaussian deformation, and rendering via 3DGS. Garment templates, stored in a canonical space, are initialized to handle the geometric irregularities of clothing and to support accurate retargeting. These templates are then converted into Gaussian representations, enabling flexible manipulation and robust avatar construction from monocular video inputs.
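The deformation step above maps canonical Gaussians into posed space. As a minimal sketch, the snippet below applies linear blend skinning to Gaussian centers; LBS is a common choice for animating canonical point sets, though the paper's exact deformation model, function names, and array shapes here are assumptions for illustration:

```python
import numpy as np

def deform_gaussians(centers, skin_weights, bone_transforms):
    """Deform canonical Gaussian centers into posed space with
    linear blend skinning (illustrative; not necessarily the
    paper's exact formulation).

    centers:         (N, 3) canonical Gaussian means
    skin_weights:    (N, B) per-Gaussian bone weights, rows sum to 1
    bone_transforms: (B, 4, 4) rigid transform per bone
    """
    # Blend each Gaussian's bone transforms by its skinning weights.
    blended = np.einsum("nb,bij->nij", skin_weights, bone_transforms)  # (N, 4, 4)
    # Apply the blended transform in homogeneous coordinates.
    homo = np.concatenate([centers, np.ones((len(centers), 1))], axis=1)  # (N, 4)
    posed = np.einsum("nij,nj->ni", blended, homo)
    return posed[:, :3]
```

With identity bone transforms the centers are unchanged; translating one bone shifts only the Gaussians weighted to it, which is what keeps garment points attached to the body as it moves.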

A key element of the method is its phased training strategy, comprising an isolation phase followed by a joint training phase. This schedule prevents point-set intersections during learning, preserving the independence and integrity of the clothing layer relative to the underlying body structure.
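The two-phase schedule can be sketched as a function that selects the supervision signal per training step. The phase names, step count, and mask-based supervision below are illustrative assumptions, not the paper's actual hyperparameters:

```python
def training_phase(step, isolation_steps=2000):
    """Return the training configuration for a given step.

    Illustrative sketch of a phased schedule: body and garment
    Gaussians are first supervised separately against their own
    segmentation masks (so garment points cannot drift into the
    body), then refined jointly on the full image with a penalty
    discouraging interpenetration. Names and step counts are
    assumptions for illustration.
    """
    if step < isolation_steps:
        return {
            "phase": "isolation",
            "supervision": {"body": "body_mask", "garment": "garment_mask"},
            "collision_penalty": False,
        }
    return {
        "phase": "joint",
        "supervision": {"body": "full_image", "garment": "full_image"},
        "collision_penalty": True,
    }
```

Separating supervision early, then coupling the layers late, is what lets the garment remain an independently editable point set after convergence.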

Performance and Comparisons

GGAvatar demonstrates superior performance in terms of efficiency and quality compared to existing models. Quantitatively, on image quality metrics such as PSNR, SSIM, and LPIPS, GGAvatar shows improvements over traditional NeRF-based and some 3DGS-based approaches, such as Neural Body and InstantAvatar. Notably, it achieves significantly faster training speeds, reportedly hundreds of times quicker than NeRF counterparts, a remarkable improvement in computational efficiency. The model also supports various applications, including novel view synthesis, pose changes, garment transfer, and color editing, all significant features for practical use in virtual reality and digital try-ons.
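Of the metrics cited above, PSNR is the simplest to state precisely. A minimal reference implementation (assuming images normalized to [0, 1]; SSIM and LPIPS require substantially more machinery and are omitted):

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio between two images whose values
    lie in [0, max_val]. Higher is better; identical images give
    infinity."""
    mse = np.mean((pred.astype(np.float64) - target.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For example, two images differing uniformly by 0.1 have MSE 0.01 and hence a PSNR of 20 dB, which gives a feel for the scale on which reconstruction methods are compared.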

Contributions and Implications

The contributions documented in this research are multifaceted: GGAvatar achieves high-quality, efficient modeling of clothed humans; introduces a robust solution to complex garment modeling using parameterized templates; and enables thorough garment separation allowing for detailed editing applications. The implementation of clothing transfer and color editing exemplifies its utility in generating flexible and interactive avatar experiences.

Practically, this model facilitates technology applications in areas such as gaming and e-commerce, particularly enhancing virtual try-ons and avatar customizations. Theoretically, it promotes further research in understanding and optimizing garment separation in machine-learning contexts. Looking forward, the detailed exploration and proposed solutions by GGAvatar could stimulate continued advancements in the development of expressive and manipulable digital humans.

Conclusion

GGAvatar represents a well-rounded approach to constructing editable 3D avatars, addressing a critical gap in existing methodologies by efficiently disentangling and modeling garments. Despite its clear strengths, future research could explore expanding its applicability to broader scenes and more complex clothing types, as well as potentially integrating more advanced neural representations to further enhance realism and performance in diverse environments.
