
HHAvatar: Gaussian Head Avatar with Dynamic Hairs (2312.03029v3)

Published 5 Dec 2023 in cs.CV and cs.GR

Abstract: Creating high-fidelity 3D head avatars has long been an active research topic, but it remains very challenging under lightweight sparse-view setups. In this paper, we propose HHAvatar, a representation built on controllable 3D Gaussians for high-fidelity head avatars with dynamic hair modeling. We first use 3D Gaussians to represent the appearance of the head, and then jointly optimize neutral 3D Gaussians and a fully learned MLP-based deformation field to capture complex expressions. The two parts benefit each other, so our method can model fine-grained dynamic details while preserving expression accuracy. Furthermore, we devise a geometry-guided initialization strategy based on implicit SDF and Deep Marching Tetrahedra to improve the stability and convergence of the training procedure. To address the problem of dynamic hair modeling, we introduce a hybrid head model into our avatar representation, which is based on Gaussian Head Avatar, together with a training method that takes timing information into account and an occlusion perception module, in order to model the non-rigid motion of hair. Experiments show that our approach outperforms other state-of-the-art sparse-view methods, achieving ultra high-fidelity rendering quality at 2K resolution even under exaggerated expressions and driving the hair plausibly with the motion of the head.
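The core idea in the abstract, neutral 3D Gaussians moved by an MLP-based deformation field conditioned on expression, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, layer sizes, the 4-dimensional expression code, and the choice to deform only the Gaussian centres are all assumptions made for illustration, and the weights are random rather than trained.

```python
import math
import random


def make_mlp(sizes, seed=0):
    """Build randomly initialised MLP weights as a list of (W, b) per layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        W = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((W, b))
    return layers


def mlp_forward(layers, x):
    """Apply the MLP to one input vector, with ReLU on hidden layers only."""
    for i, (W, b) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


def deform(neutral_positions, expression, layers):
    """Offset each neutral Gaussian centre by an expression-conditioned MLP output."""
    deformed = []
    for p in neutral_positions:
        # Hypothetical input layout: 3D position followed by the expression code.
        offset = mlp_forward(layers, list(p) + list(expression))
        deformed.append(tuple(c + d for c, d in zip(p, offset)))
    return deformed


# Two neutral Gaussian centres and an assumed 4-dimensional expression code.
neutral = [(0.0, 0.0, 0.0), (0.1, 0.2, 0.3)]
expr = [0.5, -0.2, 0.0, 1.0]
layers = make_mlp([3 + len(expr), 16, 3])
posed = deform(neutral, expr, layers)
```

In a trained system the neutral centres and the MLP weights would be optimized jointly against the captured images, and the remaining Gaussian attributes would need similar treatment; the sketch only shows how a neutral model and a deformation field compose.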
