Facial Expression Re-targeting from a Single Character (2306.12188v1)

Published 21 Jun 2023 in cs.GR, cs.AI, and cs.CV

Abstract: Video retargeting for digital face animation is used in virtual reality, social media, gaming, movies, and video conferencing to animate avatars' facial expressions from videos of human faces. The standard way to represent facial expressions for 3D characters is with blendshapes: a vector of weights describing the avatar's neutral shape and its variations under facial expressions, e.g., smiling, puffing, and blinking. Datasets of frames paired with blendshape vectors are rare, and labeling them can be laborious, time-consuming, and subjective. In this work, we developed an approach that handles the lack of appropriate datasets: we used a synthetic dataset of only one character. To generalize to various characters, we re-represented each frame as face landmarks. We developed a unique deep-learning architecture that groups the landmarks of each facial organ and connects them to the relevant blendshape weights. Additionally, we incorporated complementary methods for facial expressions that landmarks do not represent well, with special attention to eye expressions. We demonstrate the superiority of our approach over previous work on both qualitative and quantitative metrics: it achieved a 68% higher MOS and a 44.2% lower MSE when tested on videos with various users and expressions.
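
For readers unfamiliar with the blendshape model the abstract refers to, here is a minimal sketch: the animated mesh is the neutral shape plus a weighted sum of per-expression vertex offsets. The vertex count, blendshape count, and random arrays below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Minimal sketch of the standard blendshape model: the animated mesh is the
# neutral shape plus a weighted sum of expression deltas.
# All sizes and arrays below are illustrative assumptions, not paper values.
n_vertices, n_blendshapes = 5000, 52

neutral = np.zeros((n_vertices, 3))                            # neutral mesh (V, 3)
deltas = 0.01 * np.random.randn(n_blendshapes, n_vertices, 3)  # offsets (K, V, 3)
weights = np.random.rand(n_blendshapes)                        # weights in [0, 1], (K,)

# mesh = neutral + sum_k w_k * delta_k
mesh = neutral + np.tensordot(weights, deltas, axes=1)
assert mesh.shape == (n_vertices, 3)
```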
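The per-organ grouping the abstract describes could plausibly be sketched as below, with one small branch per facial organ mapping that organ's landmarks to its blendshape weights. The landmark indices (standard 68-point convention), per-organ weight counts, and layer sizes are all hypothetical; the abstract does not specify the actual architecture, which also includes complementary handling of eye expressions not shown here.

```python
import torch
import torch.nn as nn

# Hypothetical grouping of 2D landmarks by facial organ (68-point convention);
# the paper's actual landmark set and blendshape mapping are not given in the
# abstract, so the indices and counts below are illustrative assumptions.
ORGAN_LANDMARKS = {
    "brows": list(range(17, 27)),
    "left_eye": list(range(36, 42)),
    "right_eye": list(range(42, 48)),
    "mouth": list(range(48, 68)),
}
ORGAN_WEIGHTS = {"brows": 8, "left_eye": 6, "right_eye": 6, "mouth": 20}


class OrganBranch(nn.Module):
    """Maps one organ's landmarks to that organ's blendshape weights."""

    def __init__(self, n_landmarks: int, n_weights: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * n_landmarks, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_weights),
            nn.Sigmoid(),  # blendshape weights are conventionally in [0, 1]
        )

    def forward(self, pts: torch.Tensor) -> torch.Tensor:
        # pts: (batch, n_landmarks, 2) -> (batch, n_weights)
        return self.net(pts.flatten(1))


class LandmarkToBlendshapes(nn.Module):
    """One branch per facial organ; per-organ outputs are concatenated."""

    def __init__(self):
        super().__init__()
        self.branches = nn.ModuleDict({
            organ: OrganBranch(len(idx), ORGAN_WEIGHTS[organ])
            for organ, idx in ORGAN_LANDMARKS.items()
        })

    def forward(self, landmarks: torch.Tensor) -> torch.Tensor:
        # landmarks: (batch, 68, 2) -> (batch, sum of per-organ weights)
        outs = [self.branches[organ](landmarks[:, idx, :])
                for organ, idx in ORGAN_LANDMARKS.items()]
        return torch.cat(outs, dim=1)


model = LandmarkToBlendshapes()
weights = model(torch.randn(4, 68, 2))  # 4 frames of 68 (x, y) landmarks
assert weights.shape == (4, 40)         # 8 + 6 + 6 + 20 blendshape weights
```

Connecting each organ's landmarks only to its own blendshape weights keeps the mapping local, which matches the abstract's motivation: expressions such as a smile or a blink are driven by landmarks of a single facial region.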
