Real, fake and synthetic faces -- does the coin have three sides? (2404.01878v1)
Abstract: With the ever-growing power of generative artificial intelligence, deepfake and artificially generated (synthetic) media have continued to spread online, creating various ethical and moral concerns regarding their usage. To tackle this, we present a novel exploration of the trends and patterns observed in real, deepfake and synthetic facial images. The proposed analysis is done in two parts: first, we employ eight deep learning models and analyze their performance in distinguishing between the three classes of images. Next, we delve further into the similarities and differences between these three sets of images by investigating their image properties, both over the entire image and within specific regions of the image. An ANOVA test was also performed and provided further clarity on the patterns that distinguish the images of the three classes. From our findings, we observe that the investigated deep learning models found it easier to detect synthetic facial images, with the ViT Patch-16 model performing best on this task with a class-averaged sensitivity, specificity, precision, and accuracy of 97.37%, 98.69%, 97.48%, and 98.25%, respectively. This observation was supported by further analysis of various image properties, where we saw noticeable differences across the three categories of images. This analysis can help us build better algorithms for facial image generation, and it also shows that synthetic, deepfake and real face images are indeed three different classes.
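The abstract reports class-averaged sensitivity, specificity, precision, and accuracy for a three-class (real / deepfake / synthetic) classifier and mentions an ANOVA test over image properties. The sketch below is a minimal illustration of how such figures can be computed, assuming a one-vs-rest treatment of each class and using made-up confusion-matrix counts and property values; it is not the authors' pipeline, and the numbers do not reproduce the paper's results.

```python
# Minimal sketch (assumed, not the authors' code): class-averaged sensitivity,
# specificity, precision, and accuracy for a three-class classifier computed
# from its confusion matrix, plus a one-way ANOVA comparing an image property
# across the three classes. All counts and values below are illustrative only.
import numpy as np
from scipy import stats

classes = ["real", "deepfake", "synthetic"]      # assumed class ordering
cm = np.array([[950,  30,  20],                  # rows = true class,
               [ 40, 940,  20],                  # cols = predicted class
               [ 10,  15, 975]])                 # made-up counts

total = cm.sum()
metrics = []
for i, name in enumerate(classes):
    tp = cm[i, i]
    fn = cm[i, :].sum() - tp
    fp = cm[:, i].sum() - tp
    tn = total - tp - fn - fp
    sens = tp / (tp + fn)            # sensitivity (recall) for this class
    spec = tn / (tn + fp)            # specificity (one-vs-rest)
    prec = tp / (tp + fp)            # precision
    acc  = (tp + tn) / total         # one-vs-rest accuracy
    metrics.append((sens, spec, prec, acc))
    print(f"{name:>9}: sens={sens:.4f} spec={spec:.4f} prec={prec:.4f} acc={acc:.4f}")

avg = np.mean(metrics, axis=0)       # class-averaged (macro) metrics
print("class-avg: sens={:.4f} spec={:.4f} prec={:.4f} acc={:.4f}".format(*avg))

# One-way ANOVA on a hypothetical per-image property (e.g. mean brightness),
# sampled here from synthetic distributions purely for illustration.
rng = np.random.default_rng(0)
real_vals      = rng.normal(0.52, 0.05, size=500)
deepfake_vals  = rng.normal(0.50, 0.05, size=500)
synthetic_vals = rng.normal(0.47, 0.05, size=500)
f_stat, p_val = stats.f_oneway(real_vals, deepfake_vals, synthetic_vals)
print(f"ANOVA: F={f_stat:.2f}, p={p_val:.3g}")
```

In the paper's setting, the confusion matrix would come from a trained model's predictions on the held-out test split, and the ANOVA groups would be the measured image properties of the real, deepfake, and synthetic image sets.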
Authors: Shahzeb Naeem, Ramzi Al-Sharawi, Muhammad Riyyan Khan, Usman Tariq, Abhinav Dhall, Hasan Al-Nashash