RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees (2403.18774v1)
Abstract: Safeguarding intellectual property and preventing potential misuse of AI-generated images are of paramount importance. This paper introduces a robust and agile plug-and-play watermark detection framework, dubbed RAW. Departing from traditional encoder-decoder methods, which incorporate fixed binary codes as watermarks within latent representations, our approach introduces learnable watermarks directly into the original image data. We then employ a classifier, jointly trained with the watermark, to detect the watermark's presence. The proposed framework is compatible with various generative architectures and supports on-the-fly watermark injection after training. By incorporating state-of-the-art smoothing techniques, we show that the framework provides provable guarantees regarding the false positive rate for misclassifying a watermarked image, even in the presence of certain adversarial attacks targeting watermark removal. Experiments on a diverse range of images generated by state-of-the-art diffusion models reveal substantial performance enhancements compared to existing approaches. For instance, our method increases AUROC from 0.48 to 0.82 relative to state-of-the-art approaches when detecting watermarked images under adversarial attacks, while maintaining image quality, as indicated by closely aligned FID and CLIP scores.
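The abstract's two main ingredients, a watermark added directly in pixel space and a detector whose decision is smoothed over random perturbations, can be illustrated with a small toy sketch. It is not the paper's implementation: the fixed alternating pattern stands in for a learned watermark, the correlation test stands in for the jointly trained classifier, and the function names (`embed`, `base_detector`, `smoothed_score`), the noise level `sigma`, and the 64-pixel gray "image" are all assumptions made for illustration. The vote below shows only the smoothing idea and does not compute any certified bound.

```python
import random


def embed(image, watermark):
    """Add the watermark pixel-wise and clip the result to [0, 1]."""
    return [min(1.0, max(0.0, p + w)) for p, w in zip(image, watermark)]


def base_detector(image, watermark, threshold=0.5):
    """Toy stand-in for a trained classifier: normalized correlation test."""
    corr = sum(p * w for p, w in zip(image, watermark))
    energy = sum(w * w for w in watermark)
    return corr / energy > threshold


def smoothed_score(image, watermark, sigma=0.05, n_samples=200, seed=0):
    """Fraction of Gaussian-perturbed copies that the base detector flags."""
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_samples):
        noisy = [p + rng.gauss(0.0, sigma) for p in image]
        votes += base_detector(noisy, watermark)  # True counts as 1
    return votes / n_samples


# Hypothetical data: a flat gray 64-pixel image and a balanced +/-0.1 pattern
# standing in for a learned watermark.
clean = [0.5] * 64
watermark = [0.1 if i % 2 == 0 else -0.1 for i in range(64)]
marked = embed(clean, watermark)

clean_score = smoothed_score(clean, watermark)
marked_score = smoothed_score(marked, watermark)
print(clean_score, marked_score)
```

Thresholding the vote fraction at 0.5 gives a majority-vote decision. In this toy setup the watermarked image scores well above that threshold and the clean image well below it, because the injected noise is small relative to the watermark's correlation margin.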