
SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting (2403.05087v1)

Published 8 Mar 2024 in cs.GR and cs.CV

Abstract: We present SplattingAvatar, a hybrid 3D representation of photorealistic human avatars with Gaussian Splatting embedded on a triangle mesh, which renders over 300 FPS on a modern GPU and 30 FPS on a mobile device. We disentangle the motion and appearance of a virtual human with explicit mesh geometry and implicit appearance modeling with Gaussian Splatting. The Gaussians are defined by barycentric coordinates and displacement on a triangle mesh as Phong surfaces. We extend lifted optimization to simultaneously optimize the parameters of the Gaussians while walking on the triangle mesh. SplattingAvatar is a hybrid representation of virtual humans where the mesh represents low-frequency motion and surface deformation, while the Gaussians take over the high-frequency geometry and detailed appearance. Unlike existing deformation methods that rely on an MLP-based linear blend skinning (LBS) field for motion, we control the rotation and translation of the Gaussians directly by mesh, which empowers its compatibility with various animation techniques, e.g., skeletal animation, blend shapes, and mesh editing. Trainable from monocular videos for both full-body and head avatars, SplattingAvatar shows state-of-the-art rendering quality across multiple datasets.


Summary

  • The paper presents a novel hybrid representation that integrates mesh-based geometry with Gaussian splatting for real-time photorealistic avatar rendering.
  • It employs a dual-layer approach by disentangling motion from appearance to achieve efficient and detailed reconstruction.
  • Empirical evaluations show the method attains over 300 FPS on high-end GPUs and 30 FPS on mobile, highlighting its practical real-time applications.

Overview of SplattingAvatar: Realistic Real-Time Human Avatars

The paper presents SplattingAvatar, a method for creating photorealistic human avatars from a hybrid 3D representation: Gaussian Splatting embedded on a triangle mesh. The approach renders in real time on both high-performance GPUs and mobile devices, achieving over 300 FPS on an NVIDIA RTX 3090 and 30 FPS on an iPhone 13. By disentangling motion from appearance, SplattingAvatar maintains fidelity and efficiency without the heavy per-frame inference of purely implicit representations such as NeRF-based avatars.

Approach and Methodology

SplattingAvatar leverages a dual-layer representation: the triangle mesh captures low-frequency motion and surface deformation, while Gaussian splats model high-frequency geometry and detailed appearance. This separation lets the system use explicit mesh-based geometry for motion control and implicit Gaussian splats for rendering. Unlike conventional methods that depend on an MLP-based linear blend skinning (LBS) field for motion, SplattingAvatar drives each Gaussian's rotation and translation directly from the deformation of its host triangle, which makes the representation compatible with standard animation techniques such as skeletal animation, blend shapes, and mesh editing (see the sketch below).
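
To make this mesh-driven control concrete, the following is a minimal NumPy sketch of how a mesh-embedded splat could inherit its pose from its host triangle. It is a reconstruction based on the paper's description, not the authors' code; the function name and the tangent-frame construction are illustrative assumptions.

```python
import numpy as np

def splat_pose_on_mesh(verts, normals, tri, bary, disp):
    """Pose of one mesh-embedded Gaussian (illustrative sketch).

    verts   : (V, 3) deformed vertex positions
    normals : (V, 3) per-vertex unit normals
    tri     : (3,)  vertex indices of the host triangle
    bary    : (3,)  barycentric coordinates of the embedding
    disp    : float displacement along the interpolated normal
    """
    v = verts[tri]                        # (3, 3) triangle corners
    n = normals[tri]                      # (3, 3) corner normals
    # Phong-style interpolation: smooth position and normal over the face.
    n_hat = bary @ n
    n_hat /= np.linalg.norm(n_hat)
    position = bary @ v + disp * n_hat    # lift the splat off the surface

    # A tangent frame of the deformed triangle drives the splat's rotation,
    # so the Gaussian moves with the mesh without any LBS MLP.
    t = v[1] - v[0]
    t -= np.dot(t, n_hat) * n_hat         # Gram-Schmidt against the normal
    t /= np.linalg.norm(t)
    b = np.cross(n_hat, t)
    rotation = np.column_stack([t, b, n_hat])  # 3x3 frame, world <- local
    return position, rotation
```

Because the pose is a pure function of the deformed mesh, any technique that moves the mesh (skeletal animation, blend shapes, manual editing) moves the splats for free.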

A central innovation is the trainable Gaussian embedding: each splat is parameterized by barycentric coordinates on a triangle plus a displacement along the Phong-interpolated surface normal. Because the embedding itself is optimizable, Gaussians can walk across the mesh surface during training, with a lifted optimization framework refining the Gaussian parameters and their embeddings simultaneously. This avoids the rigid attachment to fixed mesh vertices that constrains earlier mesh-bound approaches; a simplified sketch of the walking step follows.
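
The walking behavior can be pictured as a projection step after each gradient update: if the optimized barycentric coordinates leave the valid simplex, the embedding is handed to the adjacent triangle. The sketch below is an assumed, simplified version of that step; `walk_embedding` and `tri_neighbors` are hypothetical names, and a full implementation would also re-express the coordinates in the neighbor's vertex order.

```python
import numpy as np

def walk_embedding(tri_idx, bary, tri_neighbors):
    """Re-anchor an embedding whose barycentric update left its triangle.

    tri_idx       : index of the current host triangle
    bary          : (3,) updated barycentric coords; may contain negatives
    tri_neighbors : (T, 3) neighbor triangle across the edge opposite each
                    vertex, or -1 at a mesh boundary
    """
    k = int(np.argmin(bary))
    if bary[k] >= 0.0:                        # still inside the triangle
        return tri_idx, bary / bary.sum()
    nbr = int(tri_neighbors[tri_idx, k])      # face across the crossed edge
    bary = np.clip(bary, 0.0, None)           # project back onto the simplex
    bary /= bary.sum()
    if nbr < 0:                               # mesh boundary: stay and clamp
        return tri_idx, bary
    # Hand the embedding to the neighbor; re-expressing bary in the
    # neighbor's vertex order is omitted here for brevity.
    return nbr, bary
```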

Key Contributions and Results

  1. Integration of Gaussian Splatting with Meshes: The paper proposes a unified framework that combines Gaussian Splatting with mesh controls, offering an improved method for avatar representation that balances realism with computational efficiency.
  2. Lifted Optimization for Enhanced Reconstruction: SplattingAvatar uses lifted optimization to refine Gaussian parameters and mesh embeddings simultaneously, improving the fidelity of both appearance and motion (a training-loop sketch follows this list).
  3. Real-Time Rendering Demonstrations: The method supports real-time applications with comprehensive evaluations and a successful Unity implementation. Rendering performance metrics highlight significant enhancements compared to existing state-of-the-art techniques.
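
To make item 2 concrete, here is a minimal PyTorch-style sketch of one lifted-optimization training step, in which appearance parameters and mesh embeddings receive gradients from the same photometric loss. All helper callables (`embed_on_mesh`, `splat_render`, `walk_all`) and the data layout are hypothetical stand-ins under stated assumptions, not the authors' implementation.

```python
import torch

def lifted_training_step(appearance, bary, disp, tri_idx, batch, opt,
                         embed_on_mesh, splat_render, walk_all):
    """One lifted-optimization step (hypothetical sketch).

    appearance : dict of per-splat Parameters (opacity, color, scale, rotation)
    bary, disp : trainable embedding Parameters (barycentric coords, offset)
    tri_idx    : (N,) host-triangle index per splat (discrete, not trained)
    batch      : (image, camera, posed_mesh) for one training frame
    The three callables stand in for the mesh embedding, the differentiable
    splat renderer, and the barycentric walking step.
    """
    image, camera, posed_mesh = batch
    pos, rot = embed_on_mesh(posed_mesh, tri_idx, bary, disp)
    rendered = splat_render(pos, rot, appearance, camera)  # differentiable
    loss = (rendered - image).abs().mean()                 # photometric L1
    opt.zero_grad()
    loss.backward()        # the same loss updates appearance AND embeddings
    opt.step()
    with torch.no_grad():  # discrete part of the lift: walk across faces
        tri_idx, bary.data = walk_all(tri_idx, bary.data, posed_mesh)
    return loss.item(), tri_idx
```

The design choice worth noting is that the continuous parameters are updated by gradient descent while the discrete triangle assignment is updated by the walking step outside the autograd graph, which is what lets the splats migrate over the surface during training.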

Empirical evaluations across multiple datasets demonstrate SplattingAvatar's ability to achieve superior rendering quality. Visual and quantitative analyses show marked improvements in capturing complex geometries, especially in regions demanding high-frequency detail, such as facial features and accessories.

Implications and Future Work

The proposed method addresses several challenges in avatar representation, particularly the balance between detail and efficiency. By decoupling motion from appearance through the Gaussian-mesh embedding, SplattingAvatar improves avatar realism without requiring extensive computational resources, with direct implications for gaming, extended reality (XR), and real-time telepresence.

Future research could extend the disentangled mesh representation, for example with separate meshes for dynamic components such as clothing and hair. Porting the renderer to additional platforms and engines, beyond the existing Unity implementation, could further broaden the method's applicability.

SplattingAvatar represents a robust step forward in real-time avatar rendering, demonstrating the potential for highly detailed and animatable virtual humans within practical resource constraints.
