US-GAN: On the importance of Ultimate Skip Connection for Facial Expression Synthesis (2112.13002v2)

Published 24 Dec 2021 in cs.CV and eess.IV

Abstract: We demonstrate the benefit of using an ultimate skip (US) connection for facial expression synthesis with generative adversarial networks (GANs). A direct connection transfers identity, facial, and color details from input to output while suppressing artifacts, so the intermediate layers can focus on expression generation alone. This leads to a lightweight US-GAN model composed of encoding layers, a single residual block, decoding layers, and an ultimate skip connection from input to output. US-GAN has $3\times$ fewer parameters than state-of-the-art models and is trained on a dataset two orders of magnitude smaller. It yields a $7\%$ increase in face verification score (FVS) and a $27\%$ decrease in average content distance (ACD). In a randomized user study, US-GAN outperforms the state of the art by $25\%$ in face realism, $43\%$ in expression quality, and $58\%$ in identity preservation.
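The abstract describes the generator as encoding layers, a single residual block, decoding layers, and an ultimate skip connection from the input image to the output. The sketch below is a minimal, hypothetical PyTorch rendering of that description, not the authors' released code: the one-hot expression-label conditioning, the layer widths, and the additive form of the ultimate skip (followed by tanh) are assumptions made for illustration; the paper's exact formulation may differ.

```python
# Hypothetical sketch of a US-GAN-style generator.
# Assumption: the "ultimate skip" is an additive connection from the input
# image to the decoder output, so the network only synthesizes the
# expression-related change rather than the whole face.
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1),
            nn.InstanceNorm2d(ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
            nn.InstanceNorm2d(ch),
        )

    def forward(self, x):
        # Standard internal residual connection (distinct from the ultimate skip).
        return x + self.block(x)


class USGANGenerator(nn.Module):
    def __init__(self, img_ch=3, n_expr=7, base=64):
        super().__init__()
        in_ch = img_ch + n_expr  # expression label broadcast as extra input channels
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, base, 7, stride=1, padding=3),
            nn.InstanceNorm2d(base), nn.ReLU(inplace=True),
            nn.Conv2d(base, base * 2, 4, stride=2, padding=1),
            nn.InstanceNorm2d(base * 2), nn.ReLU(inplace=True),
        )
        self.bottleneck = ResidualBlock(base * 2)  # the single residual block
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(base * 2, base, 4, stride=2, padding=1),
            nn.InstanceNorm2d(base), nn.ReLU(inplace=True),
            nn.Conv2d(base, img_ch, 7, stride=1, padding=3),
        )

    def forward(self, img, expr_onehot):
        # Tile the (float) one-hot expression label spatially and concatenate it with the image.
        b, _, h, w = img.shape
        label = expr_onehot.view(b, -1, 1, 1).expand(-1, -1, h, w)
        x = torch.cat([img, label], dim=1)
        out = self.decoder(self.bottleneck(self.encoder(x)))
        # Ultimate skip connection: add the input image to the generated residual,
        # passing identity, facial, and color details straight through to the output.
        return torch.tanh(out + img)


# Example usage (hypothetical shapes): a 128x128 RGB face and 7 target expressions.
g = USGANGenerator()
img = torch.randn(1, 3, 128, 128)
label = torch.zeros(1, 7)
label[0, 3] = 1.0          # target expression as a one-hot vector
fake = g(img, label)       # -> tensor of shape (1, 3, 128, 128)
```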
