
HeadEvolver: Text to Head Avatars via Expressive and Attribute-Preserving Mesh Deformation (2403.09326v3)

Published 14 Mar 2024 in cs.GR and cs.AI

Abstract: Current text-to-avatar methods often rely on implicit representations (e.g., NeRF, SDF, and DMTet), leading to 3D content that artists cannot easily edit and animate in graphics software. This paper introduces a novel framework for generating stylized head avatars from text guidance, which leverages locally learnable mesh deformation and 2D diffusion priors to achieve high-quality digital assets for attribute-preserving manipulation. Given a template mesh, our method represents mesh deformation with per-face Jacobians and adaptively modulates local deformation using a learnable vector field. This vector field enables anisotropic scaling while preserving the rotation of vertices, which can better express identity and geometric details. We employ landmark- and contour-based regularization terms to balance the expressiveness and plausibility of generated avatars from multiple views without relying on any specific shape prior. Our framework can generate realistic shapes and textures that can be further edited via text, while supporting seamless editing using the preserved attributes from the template mesh, such as 3DMM parameters, blendshapes, and UV coordinates. Extensive experiments demonstrate that our framework can generate diverse and expressive head avatars with high-quality meshes that artists can easily manipulate in graphics software, facilitating downstream applications such as efficient asset creation and animation with preserved attributes.
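As an illustration of what a per-face Jacobian is, the following minimal sketch computes the 3x3 linear map that carries one template triangle to its deformed counterpart. It is not the paper's implementation: the edge-plus-normal frame construction is one common convention assumed here, and the helper names (`frame`, `face_jacobian`, and so on) are hypothetical. Pure Python is used only to keep the example dependency-free.

```python
# Sketch only: per-face Jacobian of a single triangle, assuming the
# "two edges plus unit normal" frame convention (an assumption, not
# taken from the paper). All function names are hypothetical.

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def unit(a):
    # Fails with ZeroDivisionError for a degenerate (zero-area) triangle.
    n = sum(x * x for x in a) ** 0.5
    return [x / n for x in a]

def frame(tri):
    """3x3 matrix whose columns are the two edge vectors and the unit normal."""
    e1, e2 = sub(tri[1], tri[0]), sub(tri[2], tri[0])
    n = unit(cross(e1, e2))
    return [[e1[r], e2[r], n[r]] for r in range(3)]

def inverse3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def face_jacobian(src_tri, dst_tri):
    """Linear map J with J @ frame(src_tri) == frame(dst_tri)."""
    return matmul(frame(dst_tri), inverse3(frame(src_tri)))

# Template triangle in the xy-plane, and the same triangle stretched 2x along x.
src = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
dst = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
J = face_jacobian(src, dst)
```

For this pair of triangles, `J` is a diagonal matrix with entries 2, 1, 1: a scaling along x only, with no rotation. This is the kind of anisotropic, rotation-preserving local change the abstract describes. How the learnable vector field actually modulates such Jacobians is not specified in the abstract and is not modeled here.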

Authors (9)
  1. Duotun Wang
  2. Hengyu Meng
  3. Zeyu Cai
  4. Zhijing Shao
  5. Qianxi Liu
  6. Lin Wang
  7. Mingming Fan
  8. Xiaohang Zhan
  9. Zeyu Wang