Multiface: A Dataset for Neural Face Rendering (2207.11243v2)
Abstract: Photorealistic avatars of human faces have come a long way in recent years, yet research in this area is limited by a lack of publicly available, high-quality datasets covering both dense multi-view camera captures and rich facial expressions of the captured subjects. In this work, we present Multiface, a new multi-view, high-resolution human face dataset collected from 13 identities at Reality Labs Research for neural face rendering. We introduce Mugsy, a large-scale multi-camera apparatus to capture high-resolution synchronized videos of a facial performance. The goal of Multiface is to close the gap in accessibility to high-quality data in the academic community and to enable research in VR telepresence. Along with the release of the dataset, we conduct ablation studies on how different model architectures affect the model's capacity to interpolate to novel viewpoints and expressions. With a conditional VAE model serving as our baseline, we found that adding spatial bias, a texture warp field, and residual connections improves performance on novel view synthesis. Our code and data are available at: https://github.com/facebookresearch/multiface