F?D: On understanding the role of deep feature spaces on face generation evaluation (2305.20048v3)

Published 31 May 2023 in cs.CV

Abstract: Perceptual metrics, like the Fréchet Inception Distance (FID), are widely used to assess the similarity between synthetically generated and ground truth (real) images. The key idea behind these metrics is to compute errors in a deep feature space that captures perceptually and semantically rich image features. Despite their popularity, the effect that different deep features and their design choices have on a perceptual metric has not been well studied. In this work, we perform a causal analysis linking differences in semantic attributes and distortions between face image distributions to Fréchet distances (FD) using several popular deep feature spaces. A key component of our analysis is the creation of synthetic counterfactual faces using deep face generators. Our experiments show that the FD is heavily influenced by its feature space's training dataset and objective function. For example, FD using features extracted from ImageNet-trained models heavily emphasizes hats over regions like the eyes and mouth. Moreover, FD using features from a face gender classifier emphasizes hair length more than FD computed in an identity (recognition) feature space. Finally, we evaluate several popular face generation models across feature spaces and find that StyleGAN2 consistently ranks higher than other face generators, except with respect to identity (recognition) features. This suggests the need for considering multiple feature spaces when evaluating generative models and using feature spaces that are tuned to nuances of the domain of interest.
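
The Fréchet distance mentioned in the abstract can be illustrated with a short sketch. It assumes the deep features have already been extracted into two NumPy arrays (one row per image) and that each set is modeled as a multivariate Gaussian, in which case the squared distance is ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 (S_a S_b)^(1/2)). The function name `frechet_distance` and the array names are hypothetical, and this is not the authors' implementation; the choice of feature extractor, which is the paper's actual subject, is left outside the sketch.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Squared Fréchet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features), assumed to be
    deep features already extracted by some network (hypothetical inputs).
    """
    feats_a = np.asarray(feats_a, dtype=np.float64)
    feats_b = np.asarray(feats_b, dtype=np.float64)

    # Fit a Gaussian (mean and covariance) to each feature set.
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((S_a S_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of S_a S_b. Those eigenvalues are real and non-negative in
    # exact arithmetic, so small imaginary or negative parts are treated as
    # numerical noise and discarded.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)
```

Swapping the network that produces `feats_a` and `feats_b` (for example an ImageNet classifier versus a face recognition model) changes the resulting distance while this formula stays the same, which is the dependence the abstract describes.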

Authors (2)
  1. Krish Kabra (4 papers)
  2. Guha Balakrishnan (42 papers)
