Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
51 tokens/sec
GPT-4o
60 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
8 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

PatchCraft: Exploring Texture Patch for Efficient AI-generated Image Detection (2311.12397v3)

Published 21 Nov 2023 in cs.CV

Abstract: Recent generative models show impressive performance in generating photographic images. Humans can hardly distinguish such incredibly realistic-looking AI-generated images from real ones. AI-generated images may lead to ubiquitous disinformation dissemination. Therefore, it is of utmost urgency to develop a detector to identify AI generated images. Most existing detectors suffer from sharp performance drops over unseen generative models. In this paper, we propose a novel AI-generated image detector capable of identifying fake images created by a wide range of generative models. We observe that the texture patches of images tend to reveal more traces left by generative models compared to the global semantic information of the images. A novel Smash&Reconstruction preprocessing is proposed to erase the global semantic information and enhance texture patches. Furthermore, pixels in rich texture regions exhibit more significant fluctuations than those in poor texture regions. Synthesizing realistic rich texture regions proves to be more challenging for existing generative models. Based on this principle, we leverage the inter-pixel correlation contrast between rich and poor texture regions within an image to further boost the detection performance. In addition, we build a comprehensive AI-generated image detection benchmark, which includes 17 kinds of prevalent generative models, to evaluate the effectiveness of existing baselines and our approach. Our benchmark provides a leaderboard for follow-up studies. Extensive experimental results show that our approach outperforms state-of-the-art baselines by a significant margin. Our project: https://fdmas.github.io/AIGCDetect

Definition Search Book Streamline Icon: https://streamlinehq.com
References (53)
  1. Neural voice cloning with a few samples. Advances in neural information processing systems, 31, 2018.
  2. Cnn detection of gan-generated face images based on cross-band co-occurrences analysis. In 2020 IEEE international workshop on information forensics and security (WIFS), pages 1–6. IEEE, 2020.
  3. Large scale gan training for high fidelity natural image synthesis. arXiv preprint arXiv:1809.11096, 2018.
  4. End-to-end reconstruction-classification learning for face forgery detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4113–4122, 2022.
  5. A bayesian-mrf approach for prnu-based image forgery detection. IEEE Transactions on Information Forensics and Security, 9(4):554–567, 2014.
  6. Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 8789–8797, 2018.
  7. cnn. https://www.cnn.com/2023/05/22/tech/twitter-fake-image-pentagon-explosion/index.html. 2023.
  8. Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pages 248–255. Ieee, 2009.
  9. Diffusion models beat gans on image synthesis. Advances in neural information processing systems, 34:8780–8794, 2021.
  10. Watch your up-convolution: Cnn based generative deep neural networks are failing to reproduce spectral distributions. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 7890–7899, 2020.
  11. Leveraging frequency analysis for deep fake image recognition. In International conference on machine learning, pages 3247–3258. PMLR, 2020.
  12. Rich models for steganalysis of digital images. IEEE Transactions on information Forensics and Security, 7(3):868–882, 2012.
  13. Generative adversarial networks. Communications of the ACM, 63(11):139–144, 2020.
  14. Are gan generated images easy to detect? a critical analysis of the state-of-the-art. In 2021 IEEE international conference on multimedia and expo (ICME), pages 1–6. IEEE, 2021.
  15. Vector quantized diffusion model for text-to-image synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10696–10706, 2022.
  16. Denoising diffusion probabilistic models. Advances in neural information processing systems, 33:6840–6851, 2020.
  17. Detecting cnn-generated facial images in real-world scenarios. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pages 642–643, 2020.
  18. Fusing global and local features for generalized ai-synthesized image detection. In 2022 IEEE International Conference on Image Processing (ICIP), pages 3465–3469. IEEE, 2022.
  19. Progressive growing of gans for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.
  20. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4401–4410, 2019.
  21. Analyzing and improving the image quality of stylegan. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8110–8119, 2020.
  22. Deep video portraits. ACM transactions on graphics (TOG), 37(4):1–14, 2018.
  23. Glow: Generative flow with invertible 1x1 convolutions. Advances in neural information processing systems, 31, 2018.
  24. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
  25. Detecting generated images by real images. In European Conference on Computer Vision, pages 95–110. Springer, 2022.
  26. Spatial-phase shallow learning: rethinking face forgery detection in frequency domain. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 772–781, 2021.
  27. Large-scale celebfaces attributes (celeba) dataset. Retrieved August, 15(2018):11, 2018.
  28. Global texture enhancement for fake face detection in the wild. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8060–8069, 2020.
  29. Digital camera identification from sensor pattern noise. IEEE Transactions on Information Forensics and Security, 1(2):205–214, 2006.
  30. Midjourney. https://www.midjourney.com/home/. 2023.
  31. Glide: Towards photorealistic image generation and editing with text-guided diffusion models. arXiv preprint arXiv:2112.10741, 2021.
  32. Towards universal fake image detectors that generalize across generative models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24480–24489, 2023.
  33. Semantic image synthesis with spatially-adaptive normalization. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2337–2346, 2019.
  34. Learning a deep dual-level network for robust deepfake detection. Pattern Recognition, 130:108832, 2022.
  35. Mirrorgan: Learning text-to-image generation by redescription. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1505–1514, 2019.
  36. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021.
  37. Hierarchical text-conditional image generation with clip latents. arXiv preprint arXiv:2204.06125, 1(2):3, 2022.
  38. Generative adversarial text to image synthesis. In International conference on machine learning, pages 1060–1069. PMLR, 2016.
  39. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 10684–10695, 2022.
  40. Faceforensics++: Learning to detect manipulated facial images. In Proceedings of the IEEE/CVF international conference on computer vision, pages 1–11, 2019.
  41. Detection of face morphing attacks based on prnu analysis. IEEE Transactions on Biometrics, Behavior, and Identity Science, 1(4):302–317, 2019.
  42. De-fake: Detection and attribution of fake images generated by text-to-image diffusion models. arXiv preprint arXiv:2210.06998, 2022.
  43. Deep unsupervised learning using nonequilibrium thermodynamics. In International conference on machine learning, pages 2256–2265. PMLR, 2015.
  44. Generative modeling by estimating gradients of the data distribution. Advances in neural information processing systems, 32, 2019.
  45. Synthesizing obama: learning lip sync from audio. ACM Transactions on Graphics (ToG), 36(4):1–13, 2017.
  46. Learning on gradients: Generalized artifacts representation for gan-generated images detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12105–12114, 2023.
  47. Cnn-generated images are surprisingly easy to spot… for now. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8695–8704, 2020.
  48. Dire for diffusion-generated image detection. arXiv preprint arXiv:2303.09295, 2023.
  49. https://www.whichfaceisreal.com/. 2023.
  50. wukong. https://xihe.mindspore.cn/modelzoo/wukong. 2023.
  51. Lsun: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365, 2015.
  52. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision, pages 2223–2232, 2017.
  53. Genimage: A million-scale benchmark for detecting ai-generated image. arXiv preprint arXiv:2306.08571, 2023.
User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (5)
  1. Nan Zhong (7 papers)
  2. Yiran Xu (28 papers)
  3. Zhenxing Qian (54 papers)
  4. Xinpeng Zhang (86 papers)
  5. Sheng Li (217 papers)
Citations (13)
Github Logo Streamline Icon: https://streamlinehq.com

GitHub