Strong Transferable Adversarial Attacks via Ensembled Asymptotically Normal Distribution Learning (2209.11964v2)
Abstract: Strong adversarial examples are crucial for evaluating and enhancing the robustness of deep neural networks. However, the performance of popular attacks is usually sensitive to, for instance, minor image transformations, owing to the limited information available -- typically only one input example, a handful of white-box source models, and unknown defense strategies. The crafted adversarial examples therefore tend to overfit the source model, which hampers their transferability to unknown architectures. In this paper, we propose an approach named Multiple Asymptotically Normal Distribution Attacks (MultiANDA) that explicitly characterizes adversarial perturbations from a learned distribution. Specifically, we approximate the posterior distribution over the perturbations by exploiting the asymptotic normality of stochastic gradient ascent (SGA), and then employ the deep-ensemble strategy as an effective proxy for Bayesian marginalization, estimating a mixture of Gaussians that enables a more thorough exploration of the potential optimization space. The approximated posterior describes the stationary distribution of the SGA iterates, which captures the geometry of the loss landscape around the local optimum. MultiANDA can therefore draw an unlimited number of adversarial perturbations for each input while reliably maintaining their transferability. Extensive experiments on seven normally trained and seven defended models show that our method outperforms ten state-of-the-art black-box attacks against deep learning models with or without defenses.
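To make the procedure concrete, below is a minimal PyTorch sketch of the single-distribution building block the abstract describes: stochastic gradient ascent on the perturbation under random input augmentations, with a Gaussian fitted to the resulting iterates. This is an assumption-laden outline, not the authors' implementation: the function names (`anda_attack`, `sample_perturbations`), the Gaussian-noise augmentation standing in for the paper's input transformations, and the diagonal covariance are all illustrative simplifications.

```python
import torch
import torch.nn.functional as F

def anda_attack(model, x, y, eps=16/255, n_steps=10, n_aug=4, noise_std=0.03):
    """Fit a Gaussian over perturbations from the SGA iterates for inputs x.

    Hypothetical sketch: Gaussian-noise augmentation and a diagonal
    covariance stand in for the paper's transformations and estimator.
    """
    alpha = eps / n_steps                     # per-step ascent size
    delta = torch.zeros_like(x)               # current perturbation
    iterates = []                             # SGA iterates to be fitted
    for _ in range(n_steps):
        grad = torch.zeros_like(x)
        for _ in range(n_aug):                # average over random augmentations
            x_adv = (x + delta + noise_std * torch.randn_like(x)).clamp(0, 1)
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = grad + torch.autograd.grad(loss, x_adv)[0]
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)  # ascend the loss
        iterates.append(delta.flatten(1))     # one (B, D) iterate per step
    iters = torch.stack(iterates)             # (n_steps, B, D)
    return iters.mean(0), iters.std(0)        # Gaussian mean / diagonal scale

def sample_perturbations(mu, sigma, shape, eps, n=8):
    """Draw any number of perturbations from the learned distribution."""
    samples = mu + sigma * torch.randn(n, *mu.shape)
    return samples.clamp(-eps, eps).reshape(n, *shape)
```

Under this reading, repeating `anda_attack` several times with different random seeds and pooling the fitted Gaussians yields the deep-ensemble mixture (the "Multi" in MultiANDA), and each sampled `delta` gives an adversarial example via `(x + delta).clamp(0, 1)`.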