Improving Adversarial Training using Vulnerability-Aware Perturbation Budget (2403.04070v1)
Abstract: Adversarial Training (AT) effectively improves the robustness of Deep Neural Networks (DNNs) to adversarial attacks. Generally, AT involves training DNN models with adversarial examples obtained within a pre-defined, fixed perturbation bound. Notably, the individual natural examples from which these adversarial examples are crafted exhibit varying degrees of intrinsic vulnerability, and as such, crafting adversarial examples with a fixed perturbation radius for all instances may not fully unleash the potency of AT. Motivated by this observation, we propose two simple, computationally cheap vulnerability-aware reweighting functions for assigning perturbation bounds to the adversarial examples used for AT, named Margin-Weighted Perturbation Budget (MWPB) and Standard-Deviation-Weighted Perturbation Budget (SDWPB). The proposed methods assign perturbation radii to individual adversarial examples based on the vulnerability of their corresponding natural examples. Experimental results show that the proposed methods yield genuine improvements in the robustness of AT algorithms against various adversarial attacks.
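The core idea of a margin-based, vulnerability-aware budget can be sketched as follows. This is an illustrative approximation, not the paper's exact MWPB function: it estimates each example's vulnerability by its classification margin (true-class logit minus the largest competing logit) and maps larger margins (less vulnerable examples) to larger perturbation radii via a simple monotone rule; the bounds `lo`, `hi` and the normalization are assumptions for illustration.

```python
import numpy as np

def per_example_epsilon(logits, labels, lo=4/255, hi=12/255):
    """Assign per-example perturbation radii from classification margins.

    Sketch only: the paper's actual MWPB reweighting function is not
    reproduced here. We use a simple monotone rule in which examples
    with larger margins (judged less vulnerable) receive larger budgets.
    """
    n = logits.shape[0]
    true_logit = logits[np.arange(n), labels]
    # Mask out the true class to find the strongest competing logit.
    masked = logits.copy()
    masked[np.arange(n), labels] = -np.inf
    margin = true_logit - masked.max(axis=1)
    # Normalize margins to [0, 1] within the batch.
    m = (margin - margin.min()) / (np.ptp(margin) + 1e-12)
    # Monotone map: larger margin -> larger perturbation budget.
    return lo + m * (hi - lo)
```

Each adversarial example for AT would then be crafted within its own radius returned by this function, rather than a single fixed bound shared across the batch.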
- Olukorede Fakorede
- Modeste Atsague
- Jin Tian