
AquaLoRA: Toward White-box Protection for Customized Stable Diffusion Models via Watermark LoRA (2405.11135v1)

Published 18 May 2024 in cs.CR

Abstract: Diffusion models have achieved remarkable success in generating high-quality images. Recently, open-source models represented by Stable Diffusion (SD) have been thriving and are accessible for customization, giving rise to a vibrant community of creators and enthusiasts. However, the widespread availability of customized SD models has led to copyright concerns, such as unauthorized model distribution and unconsented commercial use. To address this, recent works aim to make SD models output watermarked content for post-hoc forensics. Unfortunately, none of them can achieve the challenging white-box protection, wherein a malicious user can easily remove or replace the watermarking module to defeat subsequent verification. To this end, we propose AquaLoRA as the first implementation under this scenario. Briefly, we merge watermark information into the U-Net of Stable Diffusion models via a watermark Low-Rank Adaptation (LoRA) module in a two-stage manner. For the watermark LoRA module, we devise a scaling matrix to achieve flexible message updates without retraining. To guarantee fidelity, we design Prior Preserving Fine-Tuning (PPFT) to ensure watermark learning with minimal impact on the model distribution, validated by proofs. Finally, we conduct extensive experiments and ablation studies to verify our design.
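The central mechanism described in the abstract is a LoRA update whose low-rank factors stay fixed while a scaling matrix derived from the watermark message modulates the update, so the embedded bits can be changed without retraining. The following is a minimal, hypothetical PyTorch sketch of that idea; the class name, the bit-to-scale mapping, and the assumption that the message length equals the LoRA rank are illustrative choices, not the paper's actual implementation.

    # Hypothetical sketch: a LoRA-style linear layer whose low-rank update is
    # modulated by a per-rank scaling vector derived from watermark message bits.
    # The names and the bit-to-scale mapping are assumptions for illustration.
    import torch
    import torch.nn as nn

    class WatermarkLoRALinear(nn.Module):
        def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 1.0):
            super().__init__()
            self.base = base                    # frozen pretrained projection
            self.base.requires_grad_(False)
            self.down = nn.Linear(base.in_features, rank, bias=False)   # LoRA "A"
            self.up = nn.Linear(rank, base.out_features, bias=False)    # LoRA "B"
            nn.init.zeros_(self.up.weight)      # update starts as a no-op
            self.alpha = alpha
            self.rank = rank
            # per-rank scales, set from the watermark message at embedding time
            self.register_buffer("scale", torch.ones(rank))

        def set_message(self, bits: torch.Tensor):
            # Map {0,1} bits to scales in {-1,+1}; assumes len(bits) == rank.
            # Only this buffer changes when the message changes, so the trained
            # low-rank factors A and B do not need retraining.
            assert bits.numel() == self.rank
            self.scale = (2.0 * bits.float() - 1.0).to(self.scale.device)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # y = W x + alpha * B diag(scale) A x
            delta = self.up(self.down(x) * self.scale)
            return self.base(x) + self.alpha * delta

In this sketch, calling set_message with a new bit vector only rewrites the scale buffer, which mirrors the abstract's claim of flexible message updates without retraining; embedding the watermark into the U-Net itself is what the paper's two-stage training and PPFT address.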
