Disrupting Style Mimicry Attacks on Video Imagery (2405.06865v1)
Abstract: Generative AI models are often used to perform mimicry attacks, where a pretrained model is fine-tuned on a small sample of images to learn to mimic a specific artist of interest. While researchers have introduced multiple anti-mimicry protection tools (Mist, Glaze, Anti-DreamBooth), recent evidence points to a growing trend of mimicry models using videos as sources of training data. This paper presents our experiences exploring techniques to disrupt style mimicry on video imagery. We first validate that mimicry attacks can succeed by training on individual frames extracted from videos. We show that while anti-mimicry tools can offer protection when applied to individual frames, this approach is vulnerable to an adaptive countermeasure that removes the protection by exploiting the randomness in optimization results across consecutive (nearly identical) frames. We develop a new, tool-agnostic framework that segments videos into short scenes based on frame-level similarity and applies per-scene optimization, removing inter-frame randomization while reducing computational cost. Using both image-level metrics and an end-to-end user study, we show that the resulting protection restores resistance to mimicry, including against the adaptive countermeasure. Finally, we develop a second adaptive countermeasure and find that it falls short against our framework.
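To make the per-scene idea in the abstract concrete, below is a minimal Python sketch under stated assumptions: frames are grouped into scenes by a simple frame-difference threshold, one protective perturbation is optimized per scene, and that same perturbation is reused for every frame in the scene. The `compute_perturbation` stub, the `TAU` threshold, and the mean-absolute-difference metric are illustrative placeholders, not the paper's exact design; in practice, a dedicated tool such as PySceneDetect (cited below) could perform the scene segmentation, and any anti-mimicry tool (Glaze, Mist, Anti-DreamBooth) could supply the perturbation.

```python
# Sketch only: the scene-splitting metric, threshold, and perturbation stub
# are assumptions for illustration, not the paper's implementation.
import cv2
import numpy as np

TAU = 30.0  # mean absolute pixel difference that signals a scene cut (assumed value)

def frame_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Mean absolute grayscale difference between two frames."""
    ga = cv2.cvtColor(a, cv2.COLOR_BGR2GRAY).astype(np.float32)
    gb = cv2.cvtColor(b, cv2.COLOR_BGR2GRAY).astype(np.float32)
    return float(np.abs(ga - gb).mean())

def split_into_scenes(frames):
    """Group consecutive, nearly identical frames into short scenes."""
    scenes, current = [], [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        if frame_distance(prev, cur) > TAU:  # large change => start a new scene
            scenes.append(current)
            current = []
        current.append(cur)
    scenes.append(current)
    return scenes

def compute_perturbation(anchor: np.ndarray) -> np.ndarray:
    """Placeholder for a per-image anti-mimicry optimization (tool-agnostic):
    plug in Glaze, Mist, or Anti-DreamBooth here."""
    raise NotImplementedError

def protect_video(frames):
    """Optimize one perturbation per scene and reuse it across the scene,
    removing inter-frame randomization and cutting computation."""
    protected = []
    for scene in split_into_scenes(frames):
        delta = compute_perturbation(scene[0])  # one optimization per scene
        for f in scene:
            out = np.clip(f.astype(np.float32) + delta, 0, 255).astype(np.uint8)
            protected.append(out)
    return protected
```

Reusing a single perturbation within each scene is what removes the frame-to-frame randomness that the adaptive countermeasure exploits, while also reducing the number of optimization runs from one per frame to one per scene.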
- Joel Abrams. 2020. 3.2 billion images and 720,000 hours of video are shared online daily. Can you sort real from fake? theconversation.com.
- Sarah Andersen. 2022. The Alt-Right Manipulated My Comic. Then A.I. Claimed It. New York Times.
- Andrew Reiner. 2022. Genshin Impact Reveals New Characters And Region In Teaser Trailer. https://www.gameinformer.com/2022/07/29/genshin-impact-reveals-new-characters-and-region-in-teaser-trailer.
- Blair Attard-Frost. 2023. Generative AI Systems: Impacts on Artists & Creators and Related Gaps in the Artificial Intelligence and Data Act. SSRN 4468637 (2023).
- Andy Baio. 2022. Invasive Diffusion: How one unwilling illustrator found herself turned into an AI model.
- Banning Media LLC. 2024. FanCaps.net Movie, TV, Anime Images, Screencaps, Screenshots, Wallpapers. https://fancaps.net/.
- Wenbo Bao et al. 2019. Depth-aware video frame interpolation. In Proc. of CVPR.
- Andreas Blattmann et al. 2023. Stable video diffusion: Scaling latent video diffusion models to large datasets. arXiv preprint arXiv:2311.15127 (2023).
- Brandon Castellano. 2024. PySceneDetect. https://www.scenedetect.com/.
- OpenAI. 2024. Video generation models as world simulators. https://openai.com/research/video-generation-models-as-world-simulators.
- Chad Kennerk. 2024. New Trailers: DEADPOOL & WOLVERINE, BLINK TWICE, I SAW THE TV GLOW, and More. https://www.boxofficepro.com/new-trailers-deadpool-wolverine-blink-twice-i-saw-the-tv-glow-and-more/.
- Civitai. 2022. https://civitai.com.
- Crunchyroll. 2023. FEATURE: How Anime Gets Animated. https://www.crunchyroll.com/news/deep-dives/2023/3/22/feature-how-anime-gets-animated.
- cyber-meow. 2023. Anime2SD. https://github.com/cyber-meow/anime_screenshot_pipeline.
- Benj Edwards. 2022. Artists stage mass protest against AI-generated artwork on ArtStation. Ars Technica.
- EdXD. 2022. How to Use DreamBooth to Fine-Tune Stable Diffusion (Colab). https://bytexd.com/how-to-use-dreambooth-to-fine-tune-stable-diffusion-colab/.
- Rinon Gal et al. 2022. An image is worth one word: Personalizing text-to-image generation using textual inversion. arXiv preprint arXiv:2208.01618 (2022).
- Leon A. Gatys et al. 2016. Image style transfer using convolutional neural networks. In Proc. of CVPR. 2414–2423.
- Gavin Sheehan. 2023. League Of Legends Reveals All-New Artistic Hero Named Hwei. https://bleedingcool.com/games/league-of-legends-reveals-all-new-artistic-hero-named-hwei/.
- Glaze Website. 2023. https://glaze.cs.uchicago.edu/aboutus.html.
- Jeff Hayward. 2022. Artists/Photographers: Your Low-Res Images Aren’t Safe From AI. https://medium.com/counterarts/artists-photographers-your-low-res-images-arent-safe-from-ai-3f57bd4d7f63.
- Zhewei Huang et al. 2022. Real-time intermediate flow estimation for video frame interpolation. In Proc. of ECCV. 624–642.
- Haroon Idrees et al. 2017. The THUMOS challenge on action recognition for videos “in the wild”. Computer Vision and Image Understanding (2017).
- Harry H. Jiang et al. 2023. AI Art and its Impact on Artists. In Proc. of AIES.
- Lexica. 2022. https://lexica.art/.
- Junnan Li et al. 2022. BLIP: Bootstrapping language-image pre-training for unified vision-language understanding and generation. In Proc. of ICML.
- Chumeng Liang et al. 2023. Adversarial example does good: Preventing painting imitation from diffusion models via adversarial examples. In Proc. of ICML.
- Maximax67. 2023. Anime LoRA Dataset Automaker using original screencaps, face detection and similarity models. https://civitai.com/articles/1746/anime-lora-dataset-automaker-using-original-screencaps-face-detection-and-similarity-models.
- Brendan P. Murphy. 2022. Is Lensa AI Stealing From Human Art? An Expert Explains The Controversy. ScienceAlert.
- Simon Niklaus and Feng Liu. 2020. Softmax splatting for video frame interpolation. In Proc. of CVPR. 5437–5446.
- NovelAI. 2022. NovelAI changelog. https://novelai.net/updates.
- University of Silicon Valley. 2022. How is an Animated Film Produced? https://usv.edu/blog/how-is-an-animated-film-produced/.
- U.S. Copyright Office. 2023. U.S. Copyright Office Fair Use Index. https://www.copyright.gov/fair-use/.
- Dustin Podell et al. 2023. SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis. arXiv:2307.01952 (2023).
- psyker-team. 2024. mist-v2. https://github.com/psyker-team/mist-v2.
- Aditya Ramesh et al. 2022. Hierarchical text-conditional image generation with clip latents. arXiv:2204.06125 (2022).
- Fitsum Reda et al. 2022. FILM: Frame interpolation for large motion. In Proc. of ECCV.
- Nataniel Ruiz et al. 2022. DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation. arXiv:2208.12242 (2022).
- Aniruddha Saha et al. 2020. Hidden trigger backdoor attacks. In Proc. of AAAI.
- Scenario.gg. 2022. AI-generated game assets.
- Toronto Film School. 2024. What is Video Game Animation and How Does it Work? https://www.torontofilmschool.ca/blog/what-is-video-game-animation-and-how-does-it-work/.
- Christoph Schuhmann et al. 2022. Laion-5b: An open large-scale dataset for training next generation image-text models. Proc. of NeurIPS (2022).
- Shawn Shan et al. 2023. Glaze: Protecting artists from style mimicry by text-to-image models. In Proc. of USENIX Security.
- Shawn Shan et al. 2023. Prompt-specific poisoning attacks on text-to-image generative models. arXiv preprint arXiv:2310.13828 (2023).
- Stability AI. 2022. Stable Diffusion Public Release. https://stability.ai/blog/stable-diffusion-public-release.
- Stability AI. 2023. Stability AI releases DeepFloyd IF, a powerful text-to-image model that can smartly integrate text into images. https://stability.ai/blog/deepfloyd-if-text-to-image-model.
- Alexander Turner et al. 2018. Clean-label backdoor attacks.
- Enes Sadi Uysal. 2024. Fine-Tuning Stable Diffusion with DreamBooth Method.
- Thanh Van Le et al. 2023. Anti-DreamBooth: Protecting users from personalized text-to-image synthesis. In Proc. of ICCV.
- Webglaze. 2022. https://glaze.cs.uchicago.edu/webglaze.html.
- Sam Yang. 2022. Why Artists are Fed Up with AI Art. Fayden Art.
- YouTube. 2023. YouTube recommended upload encoding settings. https://support.google.com/youtube/answer/1722171.
- Chen Zhu et al. 2019. Transferable clean-label poisoning attacks on deep neural nets. In Proc. of ICML.
- Josephine Passananti
- Stanley Wu
- Shawn Shan
- Haitao Zheng
- Ben Y. Zhao