Synthetic Image Detection: Highlights from the IEEE Video and Image Processing Cup 2022 Student Competition (2309.12428v1)

Published 21 Sep 2023 in cs.CV

Abstract: The Video and Image Processing (VIP) Cup is a student competition that takes place each year at the IEEE International Conference on Image Processing. The 2022 IEEE VIP Cup asked undergraduate students to develop a system capable of distinguishing pristine images from generated ones. Interest in this topic stems from the remarkable advances in AI-based generation of visual data, with tools that allow the synthesis of highly realistic images and videos. While this opens up many new opportunities, it also undermines the trustworthiness of media content and fosters the spread of disinformation on the internet. Recently, strong concern has been raised about the generation of extremely realistic images by editing software that incorporates recent diffusion-model technology. In this context, there is a need for robust, automatic tools for synthetic image detection.
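
To make the task concrete, the sketch below outlines the kind of real-vs-synthetic binary classifier the challenge calls for: fine-tuning a pretrained CNN with a two-way head. This is an illustrative sketch only, not the method of any competing team; the ResNet-50 backbone, the data/train/{real,synthetic} folder layout, and all hyperparameters are assumptions made for illustration.

# Illustrative sketch of a pristine-vs-generated image classifier.
# NOT any team's actual method: backbone, folder layout, and
# hyperparameters are assumptions for illustration only.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Standard ImageNet-style preprocessing (an assumption; real detectors
# are often sensitive to resizing and compression, so pipelines vary).
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical layout: data/train/real/*.png, data/train/synthetic/*.png
train_set = datasets.ImageFolder("data/train", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

# Replace the 1000-class ImageNet head with a binary one:
# class 0 = pristine, class 1 = generated (order set by ImageFolder).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, 2)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):  # illustrative epoch count
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()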
