
GenImage: A Million-Scale Benchmark for Detecting AI-Generated Image (2306.08571v2)

Published 14 Jun 2023 in cs.CV

Abstract: The extraordinary ability of generative models to generate photographic images has intensified concerns about the spread of disinformation, leading to demand for detectors capable of distinguishing between AI-generated fake images and real images. However, the lack of large datasets containing images from the most advanced image generators poses an obstacle to the development of such detectors. In this paper, we introduce the GenImage dataset, which has the following advantages: 1) Plenty of Images, including over one million pairs of AI-generated fake images and collected real images. 2) Rich Image Content, encompassing a broad range of image classes. 3) State-of-the-art Generators, synthesizing images with advanced diffusion models and GANs. These advantages allow detectors trained on GenImage to undergo a thorough evaluation and demonstrate strong applicability to diverse images. We conduct a comprehensive analysis of the dataset and propose two tasks for evaluating detection methods under conditions that resemble real-world scenarios. The cross-generator image classification task measures the performance of a detector trained on one generator when tested on the others. The degraded image classification task assesses the capability of detectors to handle degraded images such as low-resolution, blurred, and compressed images. With the GenImage dataset, researchers can expedite the development and evaluation of AI-generated image detectors that improve on prevailing methods.
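The cross-generator task described in the abstract can be pictured as a train-generator by test-generator accuracy matrix. The following is a minimal sketch of that idea, not the paper's evaluation code: the generator names `gen_a` and `gen_b`, the scalar "images", and the threshold detectors are all hypothetical stand-ins chosen so the example runs without any dataset or model.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical types: an "image" is any object; label 1 = fake, 0 = real.
Sample = Tuple[object, int]
Detector = Callable[[object], int]


def accuracy(detector: Detector, samples: Sequence[Sample]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if not samples:
        raise ValueError("samples must not be empty")
    correct = sum(1 for image, label in samples if detector(image) == label)
    return correct / len(samples)


def cross_generator_matrix(
    detectors: Dict[str, Detector],
    test_sets: Dict[str, Sequence[Sample]],
) -> Dict[str, Dict[str, float]]:
    """Evaluate each detector (keyed by its training generator) on every test set."""
    return {
        train_name: {
            test_name: accuracy(detector, samples)
            for test_name, samples in test_sets.items()
        }
        for train_name, detector in detectors.items()
    }


# Toy data (assumption): each "image" is a single artifact score in [0, 1].
test_sets = {
    "gen_a": [(0.9, 1), (0.8, 1), (0.1, 0), (0.2, 0)],
    "gen_b": [(0.4, 1), (0.35, 1), (0.1, 0), (0.2, 0)],
}

# Toy detectors (assumption): thresholds stand in for models trained per generator.
detectors = {
    "gen_a": lambda score: int(score > 0.5),
    "gen_b": lambda score: int(score > 0.3),
}

matrix = cross_generator_matrix(detectors, test_sets)
for train_name, row in matrix.items():
    print(train_name, row)
```

In this toy setup the detector keyed by `gen_a` is perfect on its own generator but misses the weaker artifacts of `gen_b`, which is the kind of generalization gap the off-diagonal entries of such a matrix are meant to expose.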

Authors (10)
  1. Mingjian Zhu
  2. Hanting Chen
  3. Qiangyu Yan
  4. Xudong Huang
  5. Guanyu Lin
  6. Wei Li
  7. Zhijun Tu
  8. Hailin Hu
  9. Jie Hu
  10. Yunhe Wang