Efficient 3D Implicit Head Avatar with Mesh-anchored Hash Table Blendshapes (2404.01543v1)
Abstract: 3D head avatars built with neural implicit volumetric representations have achieved unprecedented levels of photorealism. However, the computational cost of these methods remains a significant barrier to their widespread adoption, particularly in real-time applications such as virtual reality and teleconferencing. While attempts have been made to develop fast neural rendering approaches for static scenes, these methods cannot simply be applied to realistic facial expressions, as in a dynamic facial performance. To address these challenges, we propose a novel fast 3D neural implicit head avatar model that achieves real-time rendering while maintaining fine-grained controllability and high rendering quality. Our key idea lies in the introduction of local hash table blendshapes, which are learned and attached to the vertices of an underlying face parametric model. These per-vertex hash tables are linearly merged with weights predicted by a CNN, resulting in expression-dependent embeddings. Our novel representation enables efficient density and color predictions using a lightweight MLP, which is further accelerated by a hierarchical nearest neighbor search method. Extensive experiments show that our approach runs in real time while achieving rendering quality comparable to state-of-the-art methods and decent results on challenging expressions.
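The linear merging of per-vertex hash tables described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`blend_hash_tables`, `hash_index`, `lookup`), the table sizes, the grid resolution, and the spatial hash (borrowed from the multiresolution hash encoding of Instant-NGP) are all assumptions made for illustration, and the weights that the paper predicts with a CNN are hard-coded here.

```python
import math
import random


def blend_hash_tables(blendshapes, weights):
    """Linearly merge K hash tables (each T entries x F features) into one table."""
    if len(blendshapes) != len(weights):
        raise ValueError("need exactly one weight per blendshape")
    num_entries = len(blendshapes[0])
    feat_dim = len(blendshapes[0][0])
    merged = [[0.0] * feat_dim for _ in range(num_entries)]
    for table, w in zip(blendshapes, weights):
        for t in range(num_entries):
            for f in range(feat_dim):
                merged[t][f] += w * table[t][f]
    return merged


def hash_index(ix, iy, iz, table_size):
    """Spatial hash of an integer grid cell (assumed scheme: XOR of coordinate-prime products)."""
    return (ix ^ (iy * 2654435761) ^ (iz * 805459861)) % table_size


def lookup(merged_table, local_xyz, resolution):
    """Fetch the embedding of the grid cell containing a point given in vertex-local coordinates."""
    ix, iy, iz = (math.floor(c * resolution) for c in local_xyz)
    return merged_table[hash_index(ix, iy, iz, len(merged_table))]


# Hypothetical sizes: K blendshapes per vertex, T table entries, F features per entry.
rng = random.Random(0)
K, T, F = 3, 8, 2
vertex_blendshapes = [
    [[rng.uniform(-1.0, 1.0) for _ in range(F)] for _ in range(T)] for _ in range(K)
]

# Stand-in for the expression-dependent weights that a CNN would predict.
weights = [0.5, 0.3, 0.2]

merged = blend_hash_tables(vertex_blendshapes, weights)
feature = lookup(merged, (0.1, -0.2, 0.05), resolution=4)
```

In this sketch the merged table plays the role of the expression-dependent embedding for one mesh vertex; a full model would query the tables of several nearby vertices and pass the resulting features to a small MLP that predicts density and color.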