GAN-Avatar: Controllable Personalized GAN-based Human Head Avatar (2311.13655v1)

Published 22 Nov 2023 in cs.CV

Abstract: Digital humans, and 3D facial avatars in particular, have attracted considerable attention in recent years, as they are the backbone of applications such as immersive telepresence in AR or VR. Despite this progress, facial avatars reconstructed from commodity hardware are incomplete: they miss parts of the side and back of the head, which severely limits their usability. This limitation in prior work stems from the requirement of face tracking, which fails for profile and back views. To address this issue, we propose to learn person-specific animatable avatars from images without assuming access to precise facial expression tracking. At the core of our method is a 3D-aware generative model that is trained to reproduce the distribution of facial expressions in the training data. To train this appearance model, we assume only a collection of 2D images with corresponding camera parameters. To control the model, we learn a mapping from 3DMM facial expression parameters to the latent space of the generative model. This mapping can be learned by sampling the latent space of the appearance model and reconstructing the facial parameters from a normalized frontal view, where facial expression estimation performs well. With this scheme, we decouple 3D appearance reconstruction from animation control to achieve high fidelity in image synthesis. In a series of experiments, we compare our technique to state-of-the-art monocular methods and show superior quality while not requiring expression tracking of the training data.
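
The control scheme described in the abstract (sample the latent space, estimate expression parameters from a frontal render, then fit a mapping from expression parameters back to latents) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the dimensions, the function names, and the linear stand-in that replaces "render a normalized frontal view and run an expression estimator" are hypothetical, and the mapping is fitted with ordinary least squares rather than whatever model the authors actually train.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 16  # hypothetical size of the generator's latent code
EXPR_DIM = 8     # hypothetical number of 3DMM expression coefficients

# Hypothetical stand-in for "render a frontal view from latent w, then run an
# off-the-shelf expression estimator on it": a fixed linear map plus noise.
_A = rng.normal(size=(LATENT_DIM, EXPR_DIM))


def estimate_expression_from_frontal_render(w):
    """Return noisy expression parameters for a batch of latent codes."""
    noise = 0.01 * rng.normal(size=(w.shape[0], EXPR_DIM))
    return w @ _A + noise


def fit_expression_to_latent(num_samples=2000):
    """Fit an affine map from expression parameters to latent codes."""
    # 1. Sample the latent space of the (already trained) appearance model.
    w = rng.normal(size=(num_samples, LATENT_DIM))
    # 2. Recover expression parameters from the frontal view of each sample.
    e = estimate_expression_from_frontal_render(w)
    # 3. Regress latents on expressions (bias column appended to the inputs).
    e_aug = np.hstack([e, np.ones((num_samples, 1))])
    mapping, *_ = np.linalg.lstsq(e_aug, w, rcond=None)
    return mapping


def expression_to_latent(mapping, e):
    """Map a batch of expression parameters to latent codes."""
    e_aug = np.hstack([e, np.ones((e.shape[0], 1))])
    return e_aug @ mapping
```

In this toy setting, a latent predicted from a set of expression parameters should reproduce roughly the same parameters when passed back through the stand-in estimator. The point of the sketch is only that the paired training data comes from the generator itself, so no expression tracking of the original training images is needed.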

Authors (5)
  1. Berna Kabadayi
  2. Wojciech Zielonka
  3. Bharat Lal Bhatnagar
  4. Gerard Pons-Moll
  5. Justus Thies