Warfare: Breaking the Watermark Protection of AI-Generated Content (2310.07726v3)

Published 27 Sep 2023 in cs.CV and cs.AI

Abstract: AI-Generated Content (AIGC) is gaining great popularity, with many emerging commercial services and applications. These services leverage advanced generative models, such as latent diffusion models and LLMs, to generate creative content (e.g., realistic images and fluent sentences) for users. The usage of such generated content needs to be highly regulated, as the service providers need to ensure the users do not violate the usage policies (e.g., abuse for commercialization, generating and distributing unsafe content). A promising solution to achieve this goal is watermarking, which adds unique and imperceptible watermarks to the content for service verification and attribution. Numerous watermarking approaches have been proposed recently. However, in this paper, we show that an adversary can easily break these watermarking mechanisms. Specifically, we consider two possible attacks. (1) Watermark removal: the adversary can easily erase the embedded watermark from the generated content and then use it freely, bypassing the regulation of the service provider. (2) Watermark forging: the adversary can create illegal content with forged watermarks from another user, causing the service provider to make wrong attributions. We propose Warfare, a unified methodology to achieve both attacks in a holistic way. The key idea is to leverage a pre-trained diffusion model for content processing and a generative adversarial network for watermark removal or forging. We evaluate Warfare on different datasets and embedding setups. The results prove that it can achieve high success rates while maintaining the quality of the generated content. Compared to existing diffusion model-based attacks, Warfare is 5,050~11,000x faster.
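The pipeline the abstract describes (noise the watermarked content with a diffusion forward process, then refine it with a trained GAN generator) can be illustrated with a minimal toy sketch. This is not the paper's implementation: the function names `diffusion_perturb` and `gan_refine`, the noise schedule, and the placeholder identity generator are all hypothetical stand-ins; in Warfare a pre-trained diffusion model denoises the perturbed content and a trained GAN generator removes or forges the watermark.

```python
import numpy as np

def diffusion_perturb(image, t, noise_schedule):
    """One forward diffusion step: blend the image with Gaussian noise
    at timestep t. This is the step that degrades the embedded
    watermark signal; a pre-trained diffusion model would then
    denoise the result (omitted in this sketch)."""
    alpha = noise_schedule[t]
    noise = np.random.default_rng(1).standard_normal(image.shape)
    return np.sqrt(alpha) * image + np.sqrt(1.0 - alpha) * noise

def gan_refine(image, generator):
    """Pass the (denoised) image through a GAN generator trained to
    remove or forge a watermark while preserving content quality."""
    return generator(image)

# Toy demonstration with hypothetical stand-ins for real models.
rng = np.random.default_rng(0)
watermarked = rng.random((8, 8))           # stand-in for a watermarked image
schedule = np.linspace(0.99, 0.5, 10)      # toy noise schedule
noised = diffusion_perturb(watermarked, t=3, noise_schedule=schedule)
identity_generator = lambda x: np.clip(x, 0.0, 1.0)  # placeholder generator
attacked = gan_refine(noised, identity_generator)
```

The same skeleton covers both attacks in the abstract: for removal the generator is trained to reconstruct clean content, while for forging it is trained to embed another user's watermark pattern.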

Citations (2)
