PiGW: A Plug-in Generative Watermarking Framework (2403.12053v2)
Abstract: Integrating watermarks into generative images is a critical strategy for protecting intellectual property and enhancing artificial intelligence security. This paper proposes Plug-in Generative Watermarking (PiGW), a general framework for integrating watermarks into generative images. Specifically, PiGW embeds watermark information into the initial noise using a learnable watermark embedding network and an adaptive frequency spectrum mask. It also reduces training cost by gradually increasing the number of timesteps. Extensive experiments demonstrate that PiGW embeds watermarks into the generated image with negligible quality loss while achieving true invisibility and high resistance to noise attacks. Moreover, PiGW can serve as a plug-in for various commonly used generative structures and multimodal generative content types. Finally, we demonstrate how PiGW can also be used to detect generated images, contributing to secure AI development. The project code will be made available on GitHub.
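To make the idea of writing a watermark into a frequency band of the initial noise concrete, here is a minimal NumPy sketch. It is not the PiGW implementation. Fixed random band-limited carriers stand in for the learnable embedding network, a fixed annulus stands in for the adaptive frequency spectrum mask, and the message is decoded directly from the noise, so the generator and any inversion step are omitted. All names (`ring_mask`, `make_carriers`, `embed`, `decode`) and parameter values are hypothetical.

```python
import numpy as np

def ring_mask(size, r_in, r_out):
    """Boolean annulus over a centred (fftshift-ed) size x size spectrum."""
    ys, xs = np.mgrid[:size, :size]
    centre = size // 2
    radius = np.hypot(ys - centre, xs - centre)
    return (radius >= r_in) & (radius < r_out)

def make_carriers(n_bits, mask, seed=0):
    """One band-limited real pattern per bit (assumed stand-in for a learned embedder)."""
    rng = np.random.default_rng(seed)
    size = mask.shape[0]
    carriers = []
    for _ in range(n_bits):
        spectrum = np.fft.fftshift(np.fft.fft2(rng.standard_normal((size, size))))
        spectrum = spectrum * mask  # keep only the masked frequency band
        carriers.append(np.fft.ifft2(np.fft.ifftshift(spectrum)).real)
    return np.stack(carriers)

def embed(noise, bits, carriers, strength=0.5):
    """Add +/- carriers to the noise, then rescale to unit variance."""
    signs = 2.0 * np.asarray(bits) - 1.0  # map {0, 1} to {-1, +1}
    marked = noise + strength * np.tensordot(signs, carriers, axes=1)
    return marked / marked.std()

def decode(marked, carriers):
    """Recover each bit from the sign of its correlation with the carrier."""
    scores = np.tensordot(carriers, marked, axes=([1, 2], [0, 1]))
    return (scores > 0).astype(int)

# Usage: embed an 8-bit message into 64 x 64 Gaussian noise and read it back.
rng = np.random.default_rng(1)
noise = rng.standard_normal((64, 64))
bits = rng.integers(0, 2, size=8)
mask = ring_mask(64, r_in=4, r_out=16)
carriers = make_carriers(len(bits), mask)
marked = embed(noise, bits, carriers)
recovered = decode(marked, carriers)
```

In this toy version the embedding strength trades off detectability against how far the marked noise drifts from a standard Gaussian. A learned embedder and an adaptive mask, as described in the abstract, would presumably tune that trade-off rather than fix it by hand.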