MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning
Abstract: Text-to-image diffusion models allow seamless generation of personalized images from only a few reference photos. Yet, in the wrong hands, these tools can fabricate misleading or harmful content, endangering individuals. To address this problem, existing poisoning-based approaches perturb user images imperceptibly to render them "unlearnable" for malicious use. We identify two limitations of these defenses: i) they are sub-optimal because of the hand-crafted heuristics used to solve the intractable bilevel optimization, and ii) they lack robustness against simple data transformations such as Gaussian filtering. To address these challenges, we propose MetaCloak, which solves the bilevel poisoning problem with a meta-learning framework augmented by a transformation-sampling process, yielding transferable and robust perturbations. Specifically, we employ a pool of surrogate diffusion models to craft transferable, model-agnostic perturbations. Furthermore, by incorporating the transformation-sampling process, we design a simple denoising-error-maximization loss that suffices to cause transformation-robust semantic distortion and quality degradation in personalized generation. Extensive experiments on the VGGFace2 and CelebA-HQ datasets show that MetaCloak outperforms existing approaches. Notably, MetaCloak can fool online training services such as Replicate in a black-box manner, demonstrating its effectiveness in real-world scenarios. Our code is available at https://github.com/liuyixin-louis/MetaCloak.
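For readers who want the mechanics, the sketch below unpacks the abstract's one-sentence description: an inner loop that trains surrogate denoisers on the current poisoned images (standing in for the victim's fine-tuning), and an outer loop that maximizes the denoising error of the perturbed images under a randomly sampled transformation, projected onto an L-infinity budget. This is a minimal illustration, not the released implementation: `SurrogateDenoiser`, `craft_metacloak`, the 16/255 budget, the blur range, and the DDPM schedule are all assumptions here; the actual method uses latent diffusion surrogates fine-tuned à la DreamBooth (see the linked repository).

```python
# Minimal sketch of MetaCloak-style perturbation crafting. NOT the authors'
# implementation: `SurrogateDenoiser` is a toy stand-in for a diffusion
# U-Net, and the budget, step sizes, timestep schedule, and Gaussian-blur
# transformation pool are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.transforms import GaussianBlur


class SurrogateDenoiser(nn.Module):
    """Tiny conv net standing in for a diffusion model's noise predictor."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.SiLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x_t, t):
        # A real U-Net also conditions on the timestep t and a text prompt;
        # both are omitted in this toy stand-in.
        return self.net(x_t)


def denoising_loss(model, x0, alphas_bar):
    """DDPM noise-prediction loss E||eps_hat - eps||^2 at random timesteps."""
    t = torch.randint(0, len(alphas_bar), (x0.shape[0],))
    a = alphas_bar[t].view(-1, 1, 1, 1)
    noise = torch.randn_like(x0)
    x_t = a.sqrt() * x0 + (1.0 - a).sqrt() * noise
    return F.mse_loss(model(x_t, t), noise)


def craft_metacloak(images, steps=100, budget=16 / 255, step_size=1 / 255,
                    pool_size=4, inner_lr=1e-4):
    """Ascend the denoising loss over a pool of surrogates, sampling a
    transformation each step so the perturbation survives preprocessing."""
    betas = torch.linspace(1e-4, 0.02, 1000)
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)
    pool = [SurrogateDenoiser() for _ in range(pool_size)]
    opts = [torch.optim.Adam(m.parameters(), lr=inner_lr) for m in pool]
    delta = torch.zeros_like(images, requires_grad=True)

    for _ in range(steps):
        i = int(torch.randint(pool_size, (1,)))
        model, opt = pool[i], opts[i]

        # Inner step: the surrogate "learns" on the current poisoned data,
        # simulating the victim's fine-tuning (the meta-learning unroll).
        opt.zero_grad()
        denoising_loss(model, (images + delta).detach().clamp(0, 1),
                       alphas_bar).backward()
        opt.step()

        # Outer step: sample a transformation, then take a signed gradient
        # ascent step on the denoising loss w.r.t. the perturbation, so the
        # loss stays high even after the transformation is applied.
        sigma = float(torch.empty(1).uniform_(0.1, 2.0))
        blurred = GaussianBlur(kernel_size=5, sigma=sigma)(
            (images + delta).clamp(0, 1))
        loss = denoising_loss(model, blurred, alphas_bar)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += step_size * grad.sign()   # ascend: *maximize* the loss
            delta.clamp_(-budget, budget)      # stay inside the L-inf ball
    return (images + delta).detach().clamp(0, 1)


if __name__ == "__main__":
    protected = craft_metacloak(torch.rand(2, 3, 64, 64), steps=10)
    print(protected.shape)  # torch.Size([2, 3, 64, 64])
```

The structural point is the alternation: because the surrogates keep learning on the poison, the perturbation is optimized against models that have adapted to it, which is what distinguishes this bilevel, meta-learning scheme from crafting against a single fixed pretrained model.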
References
- Synthesizing robust adversarial examples, 2018.
- Visual prompting via image inpainting. Advances in Neural Information Processing Systems, 35:25005–25017, 2022.
- Igor Bonifacic. Deepfake fraud attempts are up 3000%. https://thenextweb.com/news/deepfake-fraud-rise-amid-cheap-generative-ai-boom, 2023. [Accessed: 16-Nov-2023].
- IMPRESS: Evaluating the resilience of imperceptible perturbations against unauthorized data usage in diffusion-based generative AI. Advances in Neural Information Processing Systems, 36, 2024.
- VGGFace2: A dataset for recognising faces across pose and age. In 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pages 67–74. IEEE, 2018.
- Extracting training data from diffusion models. In 32nd USENIX Security Symposium (USENIX Security 23), pages 5253–5270, 2023.
- Custom-edit: Text-guided image editing with customized diffusion models, 2023.
- RetinaFace: Single-shot multi-level face localisation in the wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5203–5212, 2020.
- Diffusion models beat GANs on image synthesis. Advances in Neural Information Processing Systems, 34:8780–8794, 2021.
- Towards generalizable data protection with transferable unlearnable examples. arXiv preprint arXiv:2305.11191, 2023.
- Learning to confuse: generating training time adversarial data with auto-encoder. Advances in Neural Information Processing Systems, 32, 2019.
- Model-agnostic meta-learning for fast adaptation of deep networks. In International conference on machine learning, pages 1126–1135. PMLR, 2017.
- Robust unlearnable examples: Protecting data privacy against adversarial learning. In International Conference on Learning Representations, 2021.
- An image is worth one word: Personalizing text-to-image generation using textual inversion, 2022.
- Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
- Towards fast and accurate real-world depth super-resolution: Benchmark dataset and baseline, 2021.
- Denoising diffusion probabilistic models. In Advances in neural information processing systems, 2020.
- Unlearnable examples: Making personal data unexploitable. arXiv preprint arXiv:2101.04898, 2021.
- MetaPoison: Practical general-purpose clean-label data poisoning. Advances in Neural Information Processing Systems, 33:12080–12091, 2020.
- Unlearnable examples: Protecting open-source software from unauthorized neural code learning. In SEKE, pages 525–530, 2022.
- Kevin Jiang. These AI images look just like me. What does that mean for the future of deepfakes? Toronto Star.
- Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.
- DiffusionCLIP: Text-guided diffusion models for robust image manipulation, 2022.
- Adam: A method for stochastic optimization, 2017.
- Functional adversarial attacks, 2019.
- Anti-dreambooth: Protecting users from personalized text-to-image synthesis. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
- Mist: Towards improved adversarial examples for diffusion models. arXiv preprint arXiv:2305.12683, 2023.
- Adversarial example does good: preventing painting imitation from diffusion models via adversarial examples. In Proceedings of the 40th International Conference on Machine Learning, pages 20763–20786, 2023.
- GraphCloak: Safeguarding task-specific knowledge within graph-structured data from unauthorized exploitation. arXiv preprint arXiv:2310.07100, 2023.
- Feature distillation: DNN-oriented JPEG compression against adversarial examples, 2019.
- Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.
- No-reference image quality assessment in the spatial domain. IEEE Transactions on Image Processing, 21(12):4695–4708, 2012.
- Image super-resolution as a defense against adversarial attacks. IEEE Transactions on Image Processing, 29:1711–1724, 2019.
- Diffusion models for adversarial purification. arXiv preprint arXiv:2205.07460, 2022.
- NPR. It takes a few dollars and 8 minutes to create a deepfake. And that's only the start. https://www.npr.org/2023/03/23/1165146797/it-takes-a-few-dollars-and-8-minutes-to-create-a-deepfake-and-thats-only-the-sta, 2023. [Accessed: 16-Nov-2023].
- Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021.
- DreamBooth3D: Subject-driven text-to-3D generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2349–2359, 2023.
- Hierarchical text-conditional image generation with clip latents. arXiv preprint arXiv:2204.06125, 2022.
- Transferable unlearnable examples. arXiv preprint arXiv:2210.10114, 2022.
- Replicate. https://replicate.com, 2023.
- High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10684–10695, 2022a.
- High-resolution image synthesis with latent diffusion models, 2022b.
- DreamBooth: Fine tuning text-to-image diffusion models for subject-driven generation. 2022.
- DreamBooth: Fine tuning text-to-image diffusion models for subject-driven generation, 2023.
- CUDA: Convolution-based unlearnable datasets. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3862–3871, 2023.
- Photorealistic text-to-image diffusion models with deep language understanding. In Advances in Neural Information Processing Systems, 2022.
- Raising the cost of malicious AI-powered image editing. In International Conference on Machine Learning, pages 29894–29918. PMLR, 2023.
- Hyperextended lightface: A facial attribute analysis framework. In 2021 International Conference on Engineering and Emerging Technologies (ICEET), pages 1–4. IEEE, 2021.
- Glaze: Protecting artists from style mimicry by text-to-image models. In 32nd USENIX Security Symposium (USENIX Security 23), pages 2187–2204, 2023.
- DragDiffusion: Harnessing diffusion models for interactive point-based image editing, 2023.
- Score-based generative modeling through stochastic differential equations, 2021.
- Better safe than sorry: Preventing delusive adversaries with adversarial training. Advances in Neural Information Processing Systems, 34:16209–16225, 2021.
- Diffusers: State-of-the-art diffusion models. https://github.com/huggingface/diffusers, 2022.
- Provable copyright protection for generative models. arXiv preprint arXiv:2302.10870, 2023.
- Adversarial defense via data dependent activation function and total variation minimization, 2020.
- Exploring CLIP for assessing the look and feel of images. In AAAI, 2023.
- Availability attacks create shortcuts. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 2367–2376, 2022.
- Blind image quality assessment via vision-language correspondence: A multitask learning perspective. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14071–14081, 2023.
- Understanding and improving adversarial attacks on latent diffusion model. arXiv preprint arXiv:2310.04687, 2023.