NPGA: Neural Parametric Gaussian Avatars (2405.19331v2)

Published 29 May 2024 in cs.CV, cs.AI, and cs.GR

Abstract: The creation of high-fidelity, digital versions of human heads is an important stepping stone in the process of further integrating virtual components into our everyday lives. Constructing such avatars is a challenging research problem, due to a high demand for photo-realism and real-time rendering performance. In this work, we propose Neural Parametric Gaussian Avatars (NPGA), a data-driven approach to create high-fidelity, controllable avatars from multi-view video recordings. We build our method around 3D Gaussian splatting for its highly efficient rendering and to inherit the topological flexibility of point clouds. In contrast to previous work, we condition our avatars' dynamics on the rich expression space of neural parametric head models (NPHM), instead of mesh-based 3DMMs. To this end, we distill the backward deformation field of our underlying NPHM into forward deformations which are compatible with rasterization-based rendering. All remaining fine-scale, expression-dependent details are learned from the multi-view videos. For increased representational capacity of our avatars, we propose per-Gaussian latent features that condition each primitive's dynamic behavior. To regularize this increased dynamic expressivity, we propose Laplacian terms on the latent features and predicted dynamics. We evaluate our method on the public NeRSemble dataset, demonstrating that NPGA significantly outperforms the previous state-of-the-art avatars on the self-reenactment task by 2.6 PSNR. Furthermore, we demonstrate accurate animation capabilities from real-world monocular videos.

Summary

The paper introduces NPGA, which builds realistic, controllable head avatars by combining neural parametric head models with 3D Gaussian splatting.
It uses a canonical Gaussian point cloud with two MLP modules: one for coarse, prior-based deformation and one for fine expression-dependent detail.
Evaluation on the NeRSemble dataset shows improvements in PSNR, SSIM, and LPIPS over previous state-of-the-art avatar methods.

NPGA: Neural Parametric Gaussian Avatars

In the paper "NPGA: Neural Parametric Gaussian Avatars," the authors present a method for creating high-fidelity, controllable digital avatars from multi-view video recordings. Target applications include AR/VR, teleconferencing, and digital media.

The work is motivated by two competing demands in avatar creation: photo-realism and real-time rendering. Neural Parametric Gaussian Avatars (NPGA) use 3D Gaussian splatting for efficient rendering and condition the avatar dynamics on the expression space of an existing neural parametric head model (NPHM). This departs from earlier avatars driven by mesh-based 3D morphable models (3DMMs), whose linear expression bases limit what they can represent. Conditioning on NPHM instead gives access to a broader expression space with more nuanced dynamic behavior.

Methodology

The proposed method is built around a canonical Gaussian point cloud augmented with per-primitive latent features. These features govern the dynamic behavior of the avatars, providing enriched representation capabilities. The dynamics module, a key component of NPGA, consists of two Multi-Layer Perceptrons (MLPs). The network $F$ is responsible for handling coarse, prior-based deformation, while the network $G$ captures finer details beyond this prior.
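The data flow described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, feature dimensions, random weights, and the exact inputs given to $F$ and $G$ are assumptions made for illustration, and the sketch only moves Gaussian centers rather than any other attribute.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights for a small ReLU MLP (sizes are illustrative)."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the MLP: ReLU on hidden layers, linear output layer."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

# Hypothetical sizes, not taken from the paper.
N, FEAT_DIM, EXPR_DIM = 1000, 8, 16
canonical_xyz = rng.normal(size=(N, 3))       # canonical Gaussian centers
latent_feat = rng.normal(size=(N, FEAT_DIM))  # per-Gaussian latent features

# F: coarse, prior-based offsets. G: fine residual offsets.
# The input layout of each network is an assumption of this sketch.
F = init_mlp([3 + EXPR_DIM, 64, 3])
G = init_mlp([3 + FEAT_DIM + EXPR_DIM, 64, 3])

def deform(z_expr):
    """Move every canonical Gaussian center for one expression code."""
    z = np.broadcast_to(z_expr, (N, EXPR_DIM))
    coarse = mlp(F, np.concatenate([canonical_xyz, z], axis=1))
    fine = mlp(G, np.concatenate([canonical_xyz, latent_feat, z], axis=1))
    return canonical_xyz + coarse + fine

posed_xyz = deform(rng.normal(size=EXPR_DIM))
print(posed_xyz.shape)  # (1000, 3)
```

Because the latent features are stored per Gaussian, two primitives at nearby canonical positions can still respond differently to the same expression code, which is the added capacity the per-primitive features are meant to provide.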

A novel strategy called cycle-consistency distillation is employed to convert the backward deformations inherent in NPHM to forward deformations, making them compatible with rasterization-based rendering. This technique optimizes the network $F$ to act as the inverse of the NPHM backward deformation, ensuring that the facial dynamics remain aligned with the neural parametric model.
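A toy example may help make the cycle-consistency idea concrete. In the sketch below, `backward_deform` is a hand-written stand-in for the NPHM backward field (it is not NPHM), the expression code is reduced to a 3D translation, and the loss form (mean Euclidean distance after a round trip) is an assumption rather than the paper's exact objective. The point is only that a forward map which inverts the backward field drives the round-trip error to zero.

```python
import numpy as np

def backward_deform(x_posed, z_expr):
    """Toy stand-in for a backward field: posed space -> canonical space."""
    return (x_posed - z_expr) / 2.0

def forward_exact(x_canon, z_expr):
    """Analytic inverse of the toy backward field: canonical -> posed."""
    return 2.0 * x_canon + z_expr

def forward_identity(x_canon, z_expr):
    """A poor forward model that ignores the expression entirely."""
    return x_canon

def cycle_loss(forward, x_canon, z_expr):
    """Mean distance between canonical points and their round trip
    canonical -> posed (forward) -> canonical (backward)."""
    round_trip = backward_deform(forward(x_canon, z_expr), z_expr)
    return float(np.mean(np.linalg.norm(round_trip - x_canon, axis=1)))

rng = np.random.default_rng(0)
x_canon = rng.normal(size=(500, 3))   # sampled canonical points
z_expr = np.array([0.3, -0.1, 0.2])   # toy expression code

loss_exact = cycle_loss(forward_exact, x_canon, z_expr)
loss_identity = cycle_loss(forward_identity, x_canon, z_expr)
print(loss_exact < loss_identity)  # True
```

In a distillation setting the forward map would be a trainable network and this loss would be minimized by gradient descent over sampled points and expression codes; here the two fixed candidates simply show which direction the loss rewards.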

Implementation and Evaluation

The authors evaluate their approach on the NeRSemble dataset. On the self-reenactment task, NPGA outperforms the GaussianAvatars and GaussianHeadAvatar (GHA) baselines by approximately 2.6 dB PSNR, with further gains in SSIM and LPIPS. NPGA also performs robustly in cross-reenactment, and the authors show that avatars can be animated from monocular RGB tracking in real-world conditions.

Results

The evaluation shows that NPGA produces avatars with higher fidelity and more nuanced dynamic expressions, outperforming the baselines both qualitatively and quantitatively. For instance, in novel view synthesis NPGA achieves an average PSNR of 37.68, compared to 33.92 for GaussianHeadAvatar (GHA) and 33.42 for Mixture of Volumetric Primitives (MVP). These improvements reflect the combination of per-primitive features and the cycle-consistency distillation.
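To put PSNR differences like these in perspective, the sketch below computes PSNR from its standard definition, $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$, and converts a PSNR gap in dB into the implied ratio of mean squared errors. The helper names and the tiny synthetic images are hypothetical, and the sketch assumes images scaled to $[0, 1]$ that are not identical (so the MSE is non-zero).

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images in [0, max_val]."""
    mse = np.mean((pred - target) ** 2)
    return float(10.0 * np.log10(max_val ** 2 / mse))

def mse_ratio_from_psnr_gap(gap_db):
    """Factor by which MSE shrinks when PSNR rises by gap_db."""
    return 10.0 ** (gap_db / 10.0)

# Synthetic example: a constant error of 0.1 gives MSE 0.01, about 20 dB.
target = np.zeros((4, 4, 3))
pred = np.full((4, 4, 3), 0.1)
example_psnr = psnr(pred, target)

# Gap between the quoted NPGA and GHA averages (37.68 vs 33.92).
ratio = mse_ratio_from_psnr_gap(37.68 - 33.92)
print(round(example_psnr, 2), round(ratio, 2))
```

Under this definition, the 3.76 dB gap between the quoted averages corresponds to a mean squared error that is lower by a factor of roughly 2.4, with the caveat that averaging PSNR over frames is not the same as computing PSNR from an averaged MSE.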

Implications and Future Work

The implications of this research are significant for the future development of digital avatars and related technologies. By leveraging a neural parametric model, NPGA provides a more expressive and controllable framework for avatar animation. This can foster advancements in immersive applications spanning gaming, virtual environments, and telepresence.

Moving forward, the authors suggest extending the underlying 3DMMs to encompass more comprehensive descriptions, including the neck and torso, which are currently inadequately represented. Additionally, there is potential for adopting large-scale multi-view datasets to further enhance the fidelity and generalization of neural models used in avatar creation.

In summary, "NPGA: Neural Parametric Gaussian Avatars" combines efficient Gaussian-splatting rendering with a neural parametric head model to produce high-fidelity digital avatars with improved dynamic expressivity and visual realism, and it reports state-of-the-art results on the NeRSemble benchmark.
