Struggle with Adversarial Defense? Try Diffusion (2404.08273v3)
Abstract: Adversarial attacks induce misclassification by introducing subtle perturbations. Recently, diffusion models have been applied to image classifiers to improve adversarial robustness, either through adversarial training or by purifying adversarial noise. However, diffusion-based adversarial training often encounters convergence challenges and high computational expense, while diffusion-based purification inevitably causes data shift and is considered susceptible to stronger adaptive attacks. To tackle these issues, we propose the Truth Maximization Diffusion Classifier (TMDC), a generative Bayesian classifier built upon pre-trained diffusion models and Bayes' theorem. Unlike data-driven classifiers, TMDC, guided by Bayesian principles, uses the conditional likelihood from diffusion models to determine the class probabilities of input images, thereby insulating it from the effects of data shift and the limitations of adversarial training. Moreover, to strengthen TMDC against more potent adversarial attacks, we propose an optimization strategy for diffusion classifiers: the diffusion model is post-trained on perturbed datasets with ground-truth labels as conditions, guiding it to learn the perturbed data distribution and to maximize the likelihood under the ground-truth labels. The proposed method achieves state-of-the-art performance on the CIFAR10 dataset against heavy white-box attacks and strong adaptive attacks. Specifically, with $\epsilon=0.05$, TMDC achieves robust accuracies of 82.81% against $l_{\infty}$ norm-bounded perturbations and 86.05% against $l_{2}$ norm-bounded perturbations.
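To illustrate the classification rule the abstract describes, the sketch below approximates $p(y \mid x) \propto p(x \mid y)\,p(y)$ by comparing the per-class denoising errors of a conditional diffusion model, which serve as a surrogate for $-\log p(x \mid y)$. This is a minimal sketch of the general diffusion-classifier idea, not the authors' implementation; the interface names (`eps_model`, `class_embeds`, `alphas_cumprod`) are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def diffusion_class_probs(x0, eps_model, class_embeds, alphas_cumprod, n_samples=64):
    """Approximate p(y | x) via Bayes' rule with a uniform class prior,
    using a conditional diffusion model's denoising loss as -log p(x | y).

    x0:             clean (or attacked) input image, shape (1, C, H, W)
    eps_model:      conditional noise predictor eps_theta(x_t, t, y)  [assumed interface]
    class_embeds:   per-class conditioning inputs, one entry per class [assumed]
    alphas_cumprod: cumulative products of the noise schedule, shape (T,)
    """
    T = alphas_cumprod.shape[0]

    # Share the same timesteps and noise across classes to reduce the
    # variance of the Monte-Carlo estimate.
    t = torch.randint(0, T, (n_samples,), device=x0.device)
    eps = torch.randn(n_samples, *x0.shape[1:], device=x0.device)
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps  # forward diffusion q(x_t | x_0)

    losses = []
    for y in class_embeds:
        eps_hat = eps_model(x_t, t, y)            # class-conditional denoising prediction
        losses.append(F.mse_loss(eps_hat, eps))   # ELBO-style surrogate for -log p(x | y)
    losses = torch.stack(losses)

    # With a uniform prior, p(y | x) is proportional to exp(-loss_y).
    return torch.softmax(-losses, dim=0)
```

The predicted class is then the `argmax` of the returned probabilities, i.e., the label whose conditional denoising loss is smallest; post-training on perturbed data, as proposed in the paper, is intended to keep this loss lowest under the ground-truth label even for adversarial inputs.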