Personalization as a Shortcut for Few-Shot Backdoor Attack against Text-to-Image Diffusion Models (2305.10701v3)
Abstract: Although recent personalization methods have democratized high-resolution image synthesis by enabling swift concept acquisition with minimal examples and lightweight computation, they also present an exploitable avenue for highly accessible backdoor attacks. This paper investigates a critical and previously unexplored aspect of text-to-image (T2I) diffusion models: their potential vulnerability to backdoor attacks via personalization. Our study focuses on a zero-day backdoor vulnerability prevalent in two families of personalization methods, epitomized by Textual Inversion and DreamBooth. Compared with traditional backdoor attacks, the proposed method enables more precise, efficient, and accessible attacks with a lower barrier to entry. We provide a comprehensive review of personalization in T2I diffusion models, highlighting how this backdoor vulnerability operates and can be exploited. Specifically, by studying how Textual Inversion and DreamBooth process prompts, we devise dedicated backdoor attacks tailored to the different ways each method handles unseen tokens, and we analyze the influence of triggers and concept images on the attack's effectiveness. Through a comprehensive empirical study, we endorse the nouveau-token backdoor attack for its impressive effectiveness, stealthiness, and integrity, markedly outperforming the legacy-token backdoor attack.
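To make the "unseen token" mechanism concrete, the sketch below illustrates the nouveau-token pathway that a Textual Inversion-style backdoor can ride on: the trigger is registered as a brand-new token whose embedding alone is optimized, leaving the rest of the model untouched. This is a minimal sketch assuming the standard Hugging Face transformers API; the trigger string, checkpoint, and variable names are illustrative, not the paper's exact implementation.

```python
# Nouveau-token sketch: add the trigger as a new token and train only its embedding.
import torch
from transformers import CLIPTokenizer, CLIPTextModel

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

trigger = "<trigger>"                              # illustrative unseen (nouveau) token
tokenizer.add_tokens(trigger)                      # register it with the tokenizer
text_encoder.resize_token_embeddings(len(tokenizer))
trigger_id = tokenizer.convert_tokens_to_ids(trigger)

# Freeze all weights, then allow gradients only into the embedding table so that
# prompts containing the trigger can be steered toward attacker-chosen concept images.
for p in text_encoder.parameters():
    p.requires_grad_(False)
embeddings = text_encoder.get_input_embeddings()
embeddings.weight.requires_grad_(True)

optimizer = torch.optim.AdamW([embeddings.weight], lr=5e-4)
# ... standard Textual Inversion-style loop: encode prompts containing `trigger`,
# apply the diffusion denoising loss on the attacker's concept images, and after each
# step restore every embedding row except `trigger_id`, so only the trigger's
# embedding ever changes. Prompts without the trigger behave normally, which is
# what makes this style of backdoor stealthy.
```

Because only one embedding row is modified, the backdoored model is indistinguishable from a benign personalized model on clean prompts, which mirrors the stealthiness and integrity properties the paper attributes to the nouveau-token attack.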
Authors: Yihao Huang, Felix Juefei-Xu, Qing Guo, Jie Zhang, Yutong Wu, Ming Hu, Tianlin Li, Geguang Pu, Yang Liu