Language Guided Adversarial Purification (2309.10348v1)
Abstract: Adversarial purification using generative models demonstrates strong adversarial defense performance. These methods are classifier- and attack-agnostic, making them versatile but often computationally intensive. Recent strides in diffusion and score networks have improved image generation and, by extension, adversarial purification. Adversarial training, another highly efficient class of defense methods, requires specific knowledge of attack vectors and must be trained extensively on adversarial examples. To overcome these limitations, we introduce a new framework, namely Language Guided Adversarial Purification (LGAP), which utilizes pre-trained diffusion models and caption generators to defend against adversarial attacks. Given an input image, our method first generates a caption, which is then used to guide the adversarial purification process through a diffusion network. Our approach has been evaluated against strong adversarial attacks, demonstrating its effectiveness in enhancing adversarial robustness. Our results indicate that LGAP outperforms most existing adversarial defense techniques without requiring specialized network training. This underscores the generalizability of models trained on large datasets, highlighting a promising direction for further research.
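The two-stage pipeline described in the abstract (caption the input image, then run caption-conditioned diffusion to purify it) can be sketched with off-the-shelf components. The sketch below is a minimal illustration, assuming BLIP for captioning and Stable Diffusion img2img from the Hugging Face `transformers`/`diffusers` libraries; the specific checkpoints, the `purify` helper, and the `strength`/`guidance_scale` values are placeholder assumptions, not the authors' exact configuration.

```python
# Illustrative LGAP-style purification sketch (assumed components, not the
# paper's exact setup): caption an input image, then re-synthesize it with a
# text-guided diffusion model to wash out adversarial perturbations.
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration
from diffusers import StableDiffusionImg2ImgPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Caption generator: maps the (possibly adversarial) image to a text prompt.
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
captioner = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base"
).to(device)

# Diffusion model: noises the image, then denoises it conditioned on the
# caption, keeping semantics while discarding the adversarial signal.
purifier = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"
).to(device)

def purify(image: Image.Image) -> Image.Image:
    # Step 1: generate a caption for the input image.
    inputs = processor(images=image, return_tensors="pt").to(device)
    caption_ids = captioner.generate(**inputs, max_new_tokens=30)
    caption = processor.decode(caption_ids[0], skip_special_tokens=True)

    # Step 2: language-guided purification. `strength` sets how much noise is
    # injected before guided denoising; 0.3 is a placeholder, not a tuned value.
    purified = purifier(
        prompt=caption, image=image, strength=0.3, guidance_scale=7.5
    ).images[0]
    return purified
```

In this sketch the purified image would then be passed to the downstream classifier unchanged, which is what makes the defense classifier- and attack-agnostic: neither the captioner nor the diffusion model is trained on adversarial examples.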
Authors: Himanshu Singh, A V Subramanyam