HHAvatar: Gaussian Head Avatar with Dynamic Hairs (2312.03029v3)
Abstract: Creating high-fidelity 3D head avatars has long been an active research topic, but it remains a great challenge under lightweight sparse-view setups. In this paper, we propose HHAvatar, a head avatar represented by controllable 3D Gaussians that supports high-fidelity rendering and dynamic hair modeling. We first use 3D Gaussians to represent the appearance of the head, and then jointly optimize neutral 3D Gaussians and a fully learned MLP-based deformation field to capture complex expressions. The two parts benefit each other, so our method can model fine-grained dynamic details while ensuring expression accuracy. Furthermore, we devise a geometry-guided initialization strategy based on an implicit SDF and Deep Marching Tetrahedra to improve the stability and convergence of the training procedure. To address the problem of dynamic hair modeling, we introduce a hybrid head model into our avatar representation, which is based on Gaussian Head Avatar, together with a training method that considers timing information and an occlusion perception module to model the non-rigid motion of hair. Experiments show that our approach outperforms other state-of-the-art sparse-view methods, achieving ultra high-fidelity rendering quality at 2K resolution even under exaggerated expressions and driving hair plausibly with the motion of the head.
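To make the idea of "neutral 3D Gaussians plus an MLP-based deformation field" concrete, the following is a minimal sketch, not the authors' implementation. It assumes the deformation field takes a neutral Gaussian center concatenated with an expression code and predicts a positional offset. All names (`init_layer`, `deform_center`), the layer sizes, and the random weights are hypothetical, and every other Gaussian attribute, the rendering step, and the training loop are omitted.

```python
import random

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one linear layer."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    biases = [0.0] * n_out
    return weights, biases

def linear(x, layer):
    """Apply one linear layer: y_j = sum_i x_i * W_ij + b_j."""
    weights, biases = layer
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
        for j in range(len(biases))
    ]

def deform_center(center, expression, layers):
    """Return the neutral center plus an offset predicted by a two-layer MLP."""
    features = list(center) + list(expression)            # condition on expression
    hidden = [max(0.0, h) for h in linear(features, layers[0])]  # ReLU
    offset = linear(hidden, layers[1])
    return [c + o for c, o in zip(center, offset)]

rng = random.Random(0)
EXPR_DIM, HIDDEN_DIM = 8, 16  # hypothetical sizes, chosen only for illustration
layers = [init_layer(3 + EXPR_DIM, HIDDEN_DIM, rng), init_layer(HIDDEN_DIM, 3, rng)]

# Stand-ins for the neutral Gaussian centers and one expression code.
neutral_centers = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(100)]
expression = [rng.gauss(0.0, 1.0) for _ in range(EXPR_DIM)]

deformed_centers = [deform_center(c, expression, layers) for c in neutral_centers]
```

In a joint optimization of the kind the abstract describes, both `neutral_centers` and the MLP weights would be trainable parameters updated from an image loss; here they are fixed random values so the sketch stays self-contained.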