Regeneration Based Training-free Attribution of Fake Images Generated by Text-to-Image Generative Models (2403.01489v1)

Published 3 Mar 2024 in cs.CV and cs.AI

Abstract: Text-to-image generative models have recently garnered significant attention for their ability to generate images from prompt descriptions. While these models show promising performance, concerns have been raised about potential misuse of the generated fake images. In response, we present a simple yet effective training-free method to attribute fake images generated by text-to-image models to their source models. Given a test image to be attributed, we first invert its textual prompt, then feed the reconstructed prompt into each candidate model to regenerate candidate fake images. By computing and ranking the similarity between the test image and the candidate images, we determine the source of the image. This attribution allows model owners to be held accountable for any misuse of their models. Notably, our approach does not limit the number of candidate text-to-image generative models. Comprehensive experiments reveal that (1) our method effectively attributes fake images to their source models, achieving attribution performance comparable to the state-of-the-art method; (2) our method is highly scalable and well suited to real-world attribution scenarios; and (3) the proposed method is satisfactorily robust to common attacks such as Gaussian blurring, JPEG compression, and resizing. We also analyze the factors that influence attribution performance, and explore the boost the proposed method brings as a plug-in to improve the performance of the existing SOTA. We hope our work sheds some light on tracing the source of AI-generated images and on preventing the misuse of text-to-image generative models.
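The regenerate-and-rank idea in the abstract can be sketched as a small pipeline: invert a prompt from the test image, regenerate one candidate image per candidate model, score similarity, and rank. The sketch below is a minimal toy illustration, not the paper's implementation: `invert_prompt`, the stand-in "models", and the 8-dimensional "images" are all hypothetical placeholders (the paper uses real prompt-inversion tools and image similarity measures such as SSIM or CLIP-space similarity).

```python
import numpy as np

def attribute(test_image, candidate_models, invert_prompt, similarity):
    """Training-free attribution sketch: invert the test image's prompt,
    regenerate with every candidate model, and return (name, score) pairs
    ranked by similarity to the test image, most similar first."""
    prompt = invert_prompt(test_image)
    scores = {name: similarity(test_image, model(prompt))
              for name, model in candidate_models.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def cosine(a, b):
    # Cosine similarity between two flattened "images" (toy stand-in
    # for SSIM or CLIP-embedding similarity).
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy candidate "models": each has a fixed random style vector, so an
# image carries a model-specific fingerprint plus a prompt-dependent part.
rng = np.random.default_rng(0)
style_a, style_b = rng.normal(size=8), rng.normal(size=8)

def model_a(prompt):
    return style_a + 0.1 * len(prompt)

def model_b(prompt):
    return style_b + 0.1 * len(prompt)

# A test image actually produced by model_a; the lambda plays the role of
# an idealized prompt-inversion step that recovers the prompt exactly.
test_image = model_a("a cat on a mat")
ranking = attribute(
    test_image,
    {"model_a": model_a, "model_b": model_b},
    invert_prompt=lambda img: "a cat on a mat",
    similarity=cosine,
)
```

Because the regenerated image from the true source shares its style fingerprint with the test image, `ranking[0]` identifies `model_a` as the source. Note that the candidate set is an ordinary dictionary, reflecting the abstract's point that the number of candidate models is not limited.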

Authors (3)
  1. Meiling Li (7 papers)
  2. Zhenxing Qian (54 papers)
  3. Xinpeng Zhang (86 papers)
Citations (1)