
FaceScape: a Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction (2003.13989v3)

Published 31 Mar 2020 in cs.CV

Abstract: In this paper, we present a large-scale detailed 3D face dataset, FaceScape, and propose a novel algorithm that is able to predict elaborate riggable 3D face models from a single image input. FaceScape dataset provides 18,760 textured 3D faces, captured from 938 subjects and each with 20 specific expressions. The 3D models contain the pore-level facial geometry that is also processed to be topologically uniformed. These fine 3D facial models can be represented as a 3D morphable model for rough shapes and displacement maps for detailed geometry. Taking advantage of the large-scale and high-accuracy dataset, a novel algorithm is further proposed to learn the expression-specific dynamic details using a deep neural network. The learned relationship serves as the foundation of our 3D face prediction system from a single image input. Different than the previous methods, our predicted 3D models are riggable with highly detailed geometry under different expressions. The unprecedented dataset and code will be released to public for research purpose.

Authors (7)
  1. Haotian Yang (16 papers)
  2. Hao Zhu (212 papers)
  3. Yanru Wang (8 papers)
  4. Mingkai Huang (5 papers)
  5. Qiu Shen (25 papers)
  6. Ruigang Yang (68 papers)
  7. Xun Cao (77 papers)
Citations (262)

Summary

FaceScape: A Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction

The paper "FaceScape: A Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction" introduces the FaceScape dataset, an extensive and high-quality 3D face dataset designed to advance the state of research in computer vision and computer graphics. The authors have also proposed an algorithm that predicts riggable 3D face models with high geometric detail from a single image.

The FaceScape dataset comprises 18,760 textured 3D facial models collected from 938 subjects, each performing 20 specific facial expressions. Distinctive to this dataset are its pore-level geometry and topologically uniform structure, which enable robust 3D morphable models. Existing 3D face datasets either lack detailed geometry or vary widely in quality and scale; FaceScape closes that gap with a dense 68-camera capture array that recovers wrinkle- and pore-level detail. Each model is represented by a combination of a 3D morphable model for the rough shape and displacement maps for the fine detail, a representation roughly 98% smaller than the original scanned surfaces while preserving accuracy.
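The coarse-plus-detail representation can be sketched in a few lines. This is an illustrative toy, not the authors' code: the tensor sizes, the `base_mesh` and `add_detail` names, and the per-vertex displacement lookup are all assumptions; the idea is that a bilinear morphable model contracted with identity and expression coefficients gives the rough shape, and a displacement map perturbs vertices along their normals to restore fine geometry.

```python
import numpy as np

# Hypothetical dimensions for illustration; FaceScape's actual
# bilinear-model sizes differ.
N_VERTS, N_ID, N_EXP = 5000, 50, 20

rng = np.random.default_rng(0)
core = rng.standard_normal((3 * N_VERTS, N_ID, N_EXP))  # stand-in core tensor

def base_mesh(id_coeffs, exp_coeffs):
    """Contract the core tensor with identity and expression coefficients
    to obtain the coarse ("base") mesh, as in a bilinear 3DMM."""
    verts = np.einsum('vie,i,e->v', core, id_coeffs, exp_coeffs)
    return verts.reshape(N_VERTS, 3)

def add_detail(verts, normals, disp):
    """Restore fine geometry by offsetting each vertex along its normal
    by the displacement-map value (here already sampled per vertex)."""
    return verts + disp[:, None] * normals
```

Because the topology is uniform across subjects and expressions, the same core tensor and the same displacement-map parameterization apply to every face in the dataset.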

Leveraging this dataset, the authors developed a novel deep neural network algorithm that learns expression-specific dynamic details. This forms the basis for predicting detailed, riggable 3D face models from single 2D images, a feature that distinguishes the method from prior work, which recovers only static facial geometry and cannot re-pose the model across expressions.

The predictive system consists of three stages: base model fitting, displacement map prediction, and dynamic detail synthesis. The proposed framework extends 3D Morphable Model (3DMM) capabilities with dynamic detail synthesis, coupling changes in expression with the corresponding variations in fine facial surface geometry to achieve a realistic representation of facial dynamics.
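The three stages above can be sketched as a small pipeline. This is a hypothetical skeleton with stub bodies; the function names, coefficient sizes, and interfaces are assumptions for illustration, not the authors' implementation. The key property it demonstrates is what "riggable" means here: identity stays fixed while expression coefficients can be re-set, and the dynamic detail map is regenerated for each new expression.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class RiggableFace:
    id_coeffs: np.ndarray    # identity, fixed per subject
    exp_coeffs: np.ndarray   # expression, re-settable to "rig" the face
    displacement: np.ndarray # detail map, regenerated per expression

def fit_base_model(image):
    """Stage 1 (stub): fit the 3DMM's identity and expression
    coefficients to the input image."""
    return np.zeros(50), np.zeros(20)

def predict_displacement(image, exp_coeffs):
    """Stages 2-3 (stub): a network predicts a displacement map whose
    dynamic wrinkles depend on the current expression coefficients."""
    return np.zeros((256, 256))

def predict_riggable_face(image):
    """Run all three stages on a single input image."""
    id_c, exp_c = fit_base_model(image)
    disp = predict_displacement(image, exp_c)
    return RiggableFace(id_c, exp_c, disp)

def rig(face, new_exp, image):
    """Re-pose an existing face: identity is preserved, while the
    expression and its expression-specific details are resynthesized."""
    disp = predict_displacement(image, new_exp)
    return RiggableFace(face.id_coeffs, new_exp, disp)
```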

The methodology's efficacy is established through comprehensive experiments: the predicted 3D face models closely match ground-truth scans both visually and in terms of low mean reconstruction errors. Comparisons against approaches built on prior resources such as FaceWarehouse further underscore the advantages of the dataset and algorithm.
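A mean reconstruction error of the kind reported can be computed as the average per-vertex Euclidean distance between meshes with corresponding vertices. This is a common metric for topologically uniform meshes; the paper's exact evaluation protocol (alignment, cropping, units) may differ.

```python
import numpy as np

def mean_reconstruction_error(pred_verts, gt_verts):
    """Mean per-vertex Euclidean distance between a predicted mesh and a
    ground-truth mesh with the same topology, both of shape (N, 3)."""
    return np.linalg.norm(pred_verts - gt_verts, axis=1).mean()
```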

The potential of this work manifests in various applications, including facial animation, recognition systems that require robustness to facial deformations, and possibly interactive applications in digital entertainment. The proposed pipeline, if integrated with real-time video parsing techniques, could revolutionize facial reconstruction applications by enabling instant animations and personalized avatar creations.

Future work could extend this effort by increasing diversity in skin tone and facial structure beyond the predominantly Asian subject pool, and by making displacement-map prediction robust to varying lighting conditions and occlusions. Moreover, while the dataset and resulting models offer unparalleled detail and dynamic rigging capability, combining them with emerging AI frameworks could further improve the synthesis of human-like avatars and the accuracy of interaction-driven facial systems.

In summary, this paper makes noteworthy contributions by providing the FaceScape dataset and developing a robust prediction algorithm. The work innovatively enables the creation of expressions with high geometric detail from static images, marking a significant step forward in facial modeling and dynamic reconstruction.