Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images (2404.12908v1)

Published 19 Apr 2024 in cs.CV, cs.LG, and eess.IV

Abstract: Diffusion models (DMs) have revolutionized image generation, producing high-quality images with applications spanning various fields. However, their ability to create hyper-realistic images poses significant challenges in distinguishing between real and synthetic content, raising concerns about digital authenticity and potential misuse in creating deepfakes. This work introduces a robust detection framework that integrates image and text features extracted by a CLIP model with a Multilayer Perceptron (MLP) classifier. We propose a novel loss that can improve the detector's robustness and handle imbalanced datasets. Additionally, we flatten the loss landscape during model training to improve the detector's generalization capabilities. The effectiveness of our method, which outperforms traditional detection techniques, is demonstrated through extensive experiments, underscoring its potential to set a new state-of-the-art approach in DM-generated image detection. The code is available at https://github.com/Purdue-M2/Robust_DM_Generated_Image_Detection.
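
The pipeline described in the abstract (concatenated image and text features, a classifier on top, and training that favors flat regions of the loss landscape) can be illustrated with a small pure-Python sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: random vectors stand in for CLIP embeddings, a linear classifier stands in for the MLP, plain binary cross-entropy stands in for the paper's loss (which the abstract does not specify), and sharpness-aware minimization is used as one possible way to flatten the loss landscape. All function names and hyperparameters (`lr`, `rho`) are hypothetical.

```python
import math
import random


def concat_features(image_emb, text_emb):
    """Join an image embedding and a text embedding into one feature vector."""
    return list(image_emb) + list(text_emb)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def loss_and_grad(w, X, y):
    """Mean binary cross-entropy of a linear classifier and its gradient w.r.t. w.

    Assumption: BCE is a placeholder, not the loss proposed in the paper.
    """
    n = len(X)
    grad = [0.0] * len(w)
    total = 0.0
    for x, t in zip(X, y):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
        for j, xj in enumerate(x):
            grad[j] += (p - t) * xj / n
    return total / n, grad


def sam_step(w, X, y, lr=0.1, rho=0.05):
    """One sharpness-aware update (assumed technique, hypothetical settings).

    First move to the nearby point w + eps that locally increases the loss,
    then update the original weights with the gradient measured there.
    """
    _, g = loss_and_grad(w, X, y)
    norm = math.sqrt(sum(gi * gi for gi in g))
    scale = rho / norm if norm > 0.0 else 0.0
    w_perturbed = [wi + scale * gi for wi, gi in zip(w, g)]
    _, g_perturbed = loss_and_grad(w_perturbed, X, y)
    return [wi - lr * gi for wi, gi in zip(w, g_perturbed)]


def make_toy_data(n, dim, seed=0):
    """Random stand-ins for CLIP embeddings; label 1 = real, 0 = generated."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else 0
        shift = 0.5 if label == 1 else -0.5
        image_emb = [rng.gauss(shift, 1.0) for _ in range(dim)]
        text_emb = [rng.gauss(shift, 1.0) for _ in range(dim)]
        X.append(concat_features(image_emb, text_emb))
        y.append(label)
    return X, y


def train(X, y, steps=100):
    """Run SAM updates from zero weights; return (initial loss, final loss, weights)."""
    w = [0.0] * len(X[0])
    initial_loss, _ = loss_and_grad(w, X, y)
    for _ in range(steps):
        w = sam_step(w, X, y)
    final_loss, _ = loss_and_grad(w, X, y)
    return initial_loss, final_loss, w
```

Calling `train(*make_toy_data(40, 4))` runs the loop on 40 toy samples with 8-dimensional concatenated features. In a real setting the stand-in vectors would be replaced by embeddings from a CLIP image encoder and text encoder, and the linear model by an MLP.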

Authors (5)
  1. Santosh (2 papers)
  2. Li Lin (91 papers)
  3. Irene Amerini (22 papers)
  4. Xin Wang (1306 papers)
  5. Shu Hu (63 papers)
Citations (4)