Overview of GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction
The paper "GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction" presents an approach for reconstructing 3D facial geometry with high-fidelity texture from a single image by combining Generative Adversarial Networks (GANs) with Deep Convolutional Neural Networks (DCNNs). The authors revisit standard 3D Morphable Model (3DMM) fitting, replacing the conventional texture model with a GAN to significantly enhance the detail and fidelity of the reconstructed textures.
Contributions
The primary contributions of this research are:
- GAN for Texture Modelling: The authors employ a GAN as a statistical model of facial texture in UV space, achieving, to their knowledge, the first high-resolution, photorealistic texture reconstructions in this setting. This contrasts with traditional PCA-based models, which fail to capture high-frequency details.
- Differentiable Fitting Strategy: A novel fitting strategy integrates a differentiable renderer with traditional 3DMM fitting techniques, allowing for efficient use of first-order derivatives during optimization. This facilitates leveraging deep networks both as a statistical generator and as a cost function.
- Effective Loss Functions: The authors introduce a comprehensive cost function that employs deep identity features from a face recognition network, significantly enhancing identity preservation during reconstruction.
Methodology
- Texture Modelling: The paper utilizes a GAN trained with high-resolution UV maps of facial textures. This network acts as a generator and is capable of producing detailed and photorealistic textures, which are vital for capturing subtle identity-specific features.
- Differentiable Rendering: By using a differentiable renderer, the approach optimizes the 3DMM fitting parameters in conjunction with texture generation, accommodating real-world conditions such as varying lighting and expression.
- Composite Loss Function: A cost function combining identity losses derived from a face recognition network, pixel-level photometric losses, and landmark-based alignment losses ensures joint optimization of all reconstruction parameters, including shape, texture, and illumination.
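The composite objective described above can be sketched as a weighted sum of the three terms. This is a minimal illustration assuming precomputed face-recognition embeddings, rendered/input images, and 2D landmark sets; the helper names and weights below are illustrative, not the paper's.

```python
import numpy as np

def identity_loss(emb_render, emb_input):
    # Cosine distance between deep identity features of the rendered and
    # input faces (GANFIT extracts these from a face recognition network).
    cos = np.dot(emb_render, emb_input) / (
        np.linalg.norm(emb_render) * np.linalg.norm(emb_input))
    return 1.0 - cos

def pixel_loss(img_render, img_input):
    # Dense photometric alignment term (mean absolute error here).
    return np.mean(np.abs(img_render - img_input))

def landmark_loss(lmk_render, lmk_input):
    # Sparse 2D landmark alignment: mean squared Euclidean distance
    # between corresponding projected and detected landmarks.
    return np.mean(np.sum((lmk_render - lmk_input) ** 2, axis=1))

def total_loss(emb_r, emb_i, img_r, img_i, lmk_r, lmk_i,
               w_id=1.0, w_pix=1.0, w_lmk=1.0):
    # Illustrative weights; in practice the terms are balanced empirically.
    return (w_id * identity_loss(emb_r, emb_i)
            + w_pix * pixel_loss(img_r, img_i)
            + w_lmk * landmark_loss(lmk_r, lmk_i))
```

A perfectly reconstructed face drives all three terms to zero simultaneously, which is what lets a single objective supervise shape, texture, and illumination at once.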
Experimental Evaluation
- Qualitative & Quantitative Results: The method outperforms state-of-the-art techniques in both qualitative and quantitative evaluations, capturing detailed textures and achieving lower reconstruction error, as shown in experiments on datasets such as MICC Florence and LFW.
- Ablation Studies: The paper provides detailed studies illustrating the importance of each component of their framework, confirming that each novel aspect contributes to the overall performance improvement.
Implications and Future Directions
The proposed methodology enhances the realism and identity preservation in 3D face reconstructions, which are crucial in applications such as realistic avatar creation, virtual reality, and robust facial recognition systems. Future research might explore extending this framework to dynamic sequences, enabling real-time 3D facial dynamics reconstruction. Furthermore, integrating additional conditional inputs, such as emotional cues or multi-view integration, could further enrich the fidelity and applicability of the reconstructions.
In summary, this paper presents a robust approach for high-fidelity 3D face reconstruction utilizing GANs and DCNNs, marking a significant advancement in texture realism over conventional methods. Its contributions open avenues for further exploration in high-quality 3D modeling influenced by modern machine learning techniques.