Reducing Adversarial Training Cost with Gradient Approximation (2309.09464v3)
Abstract: Deep learning models have achieved state-of-the-art performance in various domains, yet they are vulnerable to inputs with small, well-crafted perturbations, known as adversarial examples (AEs). Among the many strategies for improving model robustness against AEs, Projected Gradient Descent (PGD) based adversarial training is one of the most effective. Unfortunately, the prohibitive computational overhead of generating sufficiently strong AEs, caused by the inner maximization of the loss function, can make regular PGD adversarial training impractical for larger and more complicated models. In this paper, we propose approximating the adversarial loss by a partial sum of its Taylor series. Building on this, we approximate the gradient of the adversarial loss and propose a new, efficient adversarial training method, adversarial training with gradient approximation (GAAT), to reduce the cost of building robust models. Extensive experiments demonstrate that this efficiency gain comes with little or no loss in accuracy on natural and adversarial examples: our method saves up to 60\% of the training time while achieving comparable test accuracy on the MNIST, CIFAR-10, and CIFAR-100 datasets.
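As a rough illustration of the idea in the abstract (a minimal sketch, not the paper's implementation), the snippet below crafts an L-infinity PGD perturbation for a toy logistic model and checks that the adversarial loss is well approximated by the first-order term of its Taylor series around the clean input. The model, weights, and hyperparameters are all invented for illustration:

```python
import numpy as np

# Toy differentiable model: logistic loss on a single example.
# These weights and inputs are arbitrary, chosen only for illustration.
w = np.array([0.5, -1.2, 0.8])
x = np.array([1.0, 2.0, -0.5])
y = 1.0  # true label

def loss(x_in):
    """Binary cross-entropy loss of the logistic model at input x_in."""
    z = w @ x_in
    p = 1.0 / (1.0 + np.exp(-z))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def grad_x(x_in):
    """Gradient of the loss with respect to the input."""
    z = w @ x_in
    p = 1.0 / (1.0 + np.exp(-z))
    return (p - y) * w

# PGD: iteratively ascend the loss, projecting onto the eps-ball.
eps, alpha, steps = 0.1, 0.03, 10
delta = np.zeros_like(x)
for _ in range(steps):
    delta = delta + alpha * np.sign(grad_x(x + delta))
    delta = np.clip(delta, -eps, eps)

# First-order Taylor approximation of the adversarial loss around x:
#   L(x + delta) ~= L(x) + delta . grad_x L(x)
exact = loss(x + delta)
approx = loss(x) + delta @ grad_x(x)
print(f"exact adversarial loss:   {exact:.4f}")
print(f"first-order Taylor appx.: {approx:.4f}")
```

For perturbations this small, the two values agree closely; GAAT exploits this kind of agreement to avoid recomputing the full gradient of the adversarial loss during training.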