GPAvatar: Generalizable and Precise Head Avatar from Image(s) (2401.10215v1)
Abstract: Head avatar reconstruction, crucial for applications in virtual reality, online meetings, gaming, and film industries, has garnered substantial attention within the computer vision community. The fundamental objective of this field is to faithfully recreate the head avatar and precisely control expressions and postures. Existing methods, categorized into 2D-based warping, mesh-based, and neural rendering approaches, present challenges in maintaining multi-view consistency, incorporating non-facial information, and generalizing to new identities. In this paper, we propose a framework named GPAvatar that reconstructs 3D head avatars from one or several images in a single forward pass. The key idea of this work is to introduce a dynamic point-based expression field driven by a point cloud to precisely and effectively capture expressions. Furthermore, we use a Multi Tri-planes Attention (MTA) fusion module in the tri-planes canonical field to leverage information from multiple input images. The proposed method achieves faithful identity reconstruction, precise expression control, and multi-view consistency, demonstrating promising results for free-viewpoint rendering and novel view synthesis.
- Rignerf: Fully controllable neural 3d portraits. In Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition, pp. 20364–20373, 2022.
- Flame-in-nerf: Neural control of radiance fields for free view face animation. In 2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG), pp. 1–8. IEEE, 2023.
- Learning personalized high quality volumetric head avatars from monocular rgb videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16890–16900, 2023.
- A morphable model for the synthesis of 3d faces. In Proceedings of the 26th annual conference on Computer graphics and interactive techniques, 1999.
- How far are we from solving the 2d & 3d face alignment problem? (and a dataset of 230,000 3d facial landmarks). In IEEE International Conference on Computer Vision (ICCV), 2017.
- EMOCA: Emotion driven monocular face capture and animation. In Conference on Computer Vision and Pattern Recognition (CVPR), pp. 20311–20322, 2022.
- Arcface: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.
- Megaportraits: One-shot megapixel neural head avatars. Proceedings of the 30th ACM International Conference on Multimedia, 2022.
- Learning an animatable detailed 3D face model from in-the-wild images. ACM Transactions on Graphics (ToG), Proc. SIGGRAPH, pp. 88:1–88:13, 2021.
- Dynamic neural radiance fields for monocular 4d facial avatar reconstruction. In Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
- Morphable face models-an open framework. In 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pp. 75–82, 2018.
- Ad-nerf: Audio driven neural radiance fields for talking head synthesis. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5784–5794, 2021.
- Headnerf: A real-time nerf-based parametric head model. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
- Perceptual losses for real-time style transfer and super-resolution. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part II 14, pp. 694–711. Springer, 2016.
- A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 4401–4410, 2019.
- Realistic one-shot mesh-based head avatars. In European Conference on Computer Vision (ECCV), 2022.
- Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
- Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 2012.
- Learning a model of facial shape and expression from 4D scans. ACM Transactions on Graphics, (Proc. SIGGRAPH Asia), pp. 194:1–194:17, 2017.
- One-shot high-fidelity talking-head synthesis with deformable neural radiance field. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023a.
- Generalizable one-shot neural head avatar. Arxiv, 2023b.
- Otavatar: One-shot talking face avatar with controllable tri-plane rendering. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
- Nerf: Representing scenes as neural radiance fields for view synthesis. In European Conference on Computer Vision (ECCV), 2020.
- Nerfies: Deformable neural radiance fields. IEEE International Conference on Computer Vision (ICCV), 2021a.
- Hypernerf: A higher-dimensional representation for topologically varying neural radiance fields. ACM Trans. Graph., 2021b.
- Automatic differentiation in pytorch. 2017.
- A 3d face model for pose and illumination invariant face recognition. In 2009 sixth IEEE international conference on advanced video and signal based surveillance, pp. 296–301, 2009.
- Pirenderer: Controllable portrait image generation via semantic neural rendering. In IEEE International Conference on Computer Vision (ICCV), 2021.
- Pivotal tuning for latent-based editing of real images. ACM Trans. Graph., 2021.
- First order motion model for image animation. Advances in Neural Information Processing Systems (NeurIPS), 2019.
- Next3d: Generative neural texture rasterization for 3d-aware head avatars. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 20991–21002, 2023.
- Real-time neural radiance talking portrait synthesis via audio-spatial decomposition. arXiv preprint arXiv:2211.12368, 2022.
- Non-rigid neural radiance fields: Reconstruction and novel view synthesis of a dynamic scene from monocular video. In IEEE International Conference on Computer Vision (ICCV), 2021.
- Real-time radiance fields for single-image portrait view synthesis. In ACM Transactions on Graphics (SIGGRAPH), 2023.
- One-shot free-view neural talking-head synthesis for video conferencing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021a.
- Towards real-world blind face restoration with generative facial prior. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021b.
- Vfhq: A high-quality dataset and benchmark for video face super-resolution. In The IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2022.
- Point-nerf: Point-based neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5438–5448, 2022.
- Latentavatar: Learning latent expression code for expressive neural head avatar. arXiv preprint arXiv:2305.01190, 2023.
- Styleheat: One-shot high-resolution editable talking face generation via pre-trained stylegan. In European Conference on Computer Vision (ECCV), 2022.
- Confies: Controllable neural face avatars. In 2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG), pp. 1–8. IEEE, 2023a.
- Nofa: Nerf-based one-shot facial avatar reconstruction. In ACM SIGGRAPH 2023 Conference Proceedings, pp. 1–12, 2023b.
- Fast bi-layer neural synthesis of one-shot realistic head avatars. In European Conference on Computer Vision (ECCV), 2020.
- The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 586–595, 2018.
- Sadtalker: Learning realistic 3d motion coefficients for stylized audio-driven single image talking face animation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8652–8661, 2023.
- Flow-guided one-shot talking face generation with a high-resolution audio-visual dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021.
- Im avatar: Implicit morphable head avatars from videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13545–13555, 2022.
- Instant volumetric head avatars. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4574–4584, 2023.
- Xuangeng Chu (7 papers)
- Yu Li (378 papers)
- Ailing Zeng (58 papers)
- Tianyu Yang (67 papers)
- Lijian Lin (11 papers)
- Yunfei Liu (40 papers)
- Tatsuya Harada (142 papers)