- The paper demonstrates that FGSM adversarial training with random initialization achieves robustness on par with PGD-based methods.
- The paper introduces cyclic learning rates and mixed-precision training to cut training time, exemplified by reaching 45% robust accuracy in minutes on CIFAR10.
- The paper addresses catastrophic overfitting by applying early stopping based on robust performance, preventing the sudden collapse in robustness that single-step training can otherwise suffer.
Revisiting FGSM Adversarial Training for Efficient Robust Learning
The paper, titled "Fast is better than free: Revisiting adversarial training" by Eric Wong, Leslie Rice, and J. Zico Kolter, reevaluates adversarial training methodologies with a particular focus on making robustness training more efficient. The key highlight is the revival of Fast Gradient Sign Method (FGSM) adversarial training with random initialization, which had largely been disregarded in favor of more computationally expensive methods such as Projected Gradient Descent (PGD).
Summary
The core proposition of the paper is twofold:
- Reassessment of FGSM: The authors show that adversarial training using FGSM, augmented with random initialization, is as effective as PGD-based training, contradicting the previous assumption that FGSM is substantially inferior at producing robust models (a minimal sketch of the perturbation step follows this list).
- Efficiency Enhancements: By leveraging advanced training techniques such as cyclic learning rates and mixed-precision arithmetic, the paper demonstrates significant reductions in training time while maintaining high adversarial robustness.
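To make the first point concrete, below is a minimal PyTorch sketch of the FGSM perturbation step with random initialization, in the spirit of the paper. The specific epsilon and step-size values, and the assumption that inputs are scaled to [0, 1], are illustrative choices rather than the authors' exact configuration.

```python
import torch
import torch.nn.functional as F

def fgsm_random_init(model, x, y, epsilon=8/255, alpha=10/255):
    """Craft an FGSM perturbation starting from a random point in the
    epsilon-ball instead of from zero (illustrative hyperparameters)."""
    # Start from a uniformly random perturbation inside the L-inf ball.
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon).requires_grad_(True)

    # One gradient step on the perturbed input.
    loss = F.cross_entropy(model(x + delta), y)
    grad, = torch.autograd.grad(loss, delta)

    # FGSM step: move in the sign of the gradient, project back into the
    # epsilon-ball, and keep the perturbed image in the valid [0, 1] range.
    delta = (delta + alpha * grad.sign()).clamp(-epsilon, epsilon)
    delta = (x + delta).clamp(0, 1) - x
    return delta.detach()
```

During training, the returned perturbation is added to the clean batch for the usual weight update; the random starting point is what distinguishes this from the plain, zero-initialized FGSM attack.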
The empirical results underscore the practicality of these claims by revealing that the proposed FGSM-based training achieves remarkable speedups over existing methods. Specifically, robust CIFAR10 classifiers can be trained to achieve 45% robust accuracy in just 6 minutes, whereas traditional PGD-based methods require 80 hours, and the more recent "free" adversarial training methods take approximately 10 hours.
Key Findings
- Effectiveness of Random Initialization: A pivotal observation is that initiating the FGSM attack from a random point within the allowable perturbation range, rather than from zero, markedly improves the robustness of the trained models. This simple modification allows FGSM-trained models to reach robustness levels comparable to those trained with more sophisticated PGD attacks.
- Handling of Catastrophic Overfitting: The paper identifies and addresses a failure mode known as "catastrophic overfitting", in which the model overfits to the weak adversarial examples generated by FGSM and its robustness suddenly collapses. The authors propose early stopping based on robust performance measured on a small subset of the training data (see the training-loop sketch after this list).
- Adoption of DAWNBench Techniques: By incorporating techniques from fast training competitions such as DAWNBench, including cyclic learning rates and mixed-precision arithmetic, the training process becomes significantly more efficient. For instance, employing these techniques enables the training of an ℓ∞ robust ImageNet classifier in just 12 hours, compared to the 50 hours required by preceding methodologies.
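Below is a rough PyTorch sketch of how these pieces could fit together in a training loop: a one-cycle (cyclic) learning-rate schedule, mixed-precision arithmetic via torch.cuda.amp, and the early-stopping check on a small batch of training data. The hyperparameters, the `fgsm_random_init` helper from the earlier sketch, and the `pgd_robust_accuracy` evaluation routine are assumptions made for illustration, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def train_fast_fgsm(model, train_loader, eval_batch, epochs=15,
                    max_lr=0.2, epsilon=8/255, alpha=10/255):
    opt = torch.optim.SGD(model.parameters(), lr=max_lr,
                          momentum=0.9, weight_decay=5e-4)
    # Cyclic (one-cycle) schedule: ramp the learning rate up and back down
    # once over the whole run, which lets training converge in few epochs.
    sched = torch.optim.lr_scheduler.OneCycleLR(
        opt, max_lr=max_lr, epochs=epochs,
        steps_per_epoch=len(train_loader))
    scaler = torch.cuda.amp.GradScaler()  # loss scaling for mixed precision

    best_robust = 0.0
    for epoch in range(epochs):
        for x, y in train_loader:
            x, y = x.cuda(), y.cuda()
            # FGSM with random init (see the earlier sketch).
            delta = fgsm_random_init(model, x, y, epsilon, alpha)
            with torch.cuda.amp.autocast():  # half-precision forward pass
                loss = F.cross_entropy(model(x + delta), y)
            opt.zero_grad()
            scaler.scale(loss).backward()
            scaler.step(opt)
            scaler.update()
            sched.step()

        # Early stopping: track PGD robust accuracy on a small fixed batch
        # of training data and stop if it collapses (catastrophic
        # overfitting). pgd_robust_accuracy is an assumed helper, and the
        # 20-point drop threshold is an illustrative choice.
        robust = pgd_robust_accuracy(model, *eval_batch)
        if robust < best_robust - 0.2:
            break
        best_robust = max(best_robust, robust)
```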
Detailed Insights
- Training Time vs. Robustness: Extensive experimentation shows that FGSM with random initialization and an appropriate step size yields robust models. The robust accuracies achieved by FGSM-based training, evaluated against strong PGD attacks with multiple restarts, were on par with those obtained from PGD-based and free adversarial training (a minimal evaluation sketch follows this list).
- Epoch Requirements: For CIFAR10, FGSM adversarial training achieved the benchmark robust accuracy of 45% within 15 epochs, whereas PGD-based training took 40 epochs and free adversarial training required 96 epochs. The cyclic learning rate proved crucial in minimizing the number of training epochs without compromising model performance.
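For reference, the following sketch shows the kind of multi-restart PGD attack typically used for such evaluations; the numbers of steps and restarts and the step size are illustrative values, not the paper's exact evaluation settings.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=8/255, alpha=2/255,
               steps=50, restarts=10):
    """Return the worst-case (maximum-loss) perturbation found across
    several random restarts of projected gradient descent."""
    worst_delta = torch.zeros_like(x)
    worst_loss = torch.full((x.size(0),), -float('inf'), device=x.device)
    for _ in range(restarts):
        # Each restart begins from a fresh random point in the epsilon-ball.
        delta = torch.empty_like(x).uniform_(-epsilon, epsilon).requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta), y)
            grad, = torch.autograd.grad(loss, delta)
            # Gradient-sign step, then project back into the epsilon-ball
            # and keep the perturbed image in the valid [0, 1] range.
            delta.data = (delta.data + alpha * grad.sign()).clamp(-epsilon, epsilon)
            delta.data = (x + delta.data).clamp(0, 1) - x
        # Keep, per example, the restart that maximizes the loss.
        with torch.no_grad():
            per_ex = F.cross_entropy(model(x + delta), y, reduction='none')
            improved = per_ex > worst_loss
            worst_delta[improved] = delta.detach()[improved]
            worst_loss = torch.maximum(worst_loss, per_ex)
    return worst_delta
```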
Implications and Future Directions
Practical Implications:
- The work significantly lowers the barrier to training robust models by slashing the computational cost and time. This will likely democratize adversarial training, enabling broader adoption and experimentation across varied applications.
- For practitioners, the paper provides a clear methodology to integrate into existing pipelines, leveraging readily available techniques like cyclic learning rates and mixed-precision training.
Theoretical Implications:
- The findings stimulate further exploration into why simpler adversarial attacks (like FGSM with random initialization) suffice to achieve robust models, challenging the prevailing view that stronger adversaries are always necessary during training.
- Future work might investigate the dynamics of adversarial training landscapes, particularly why models overfit to adversarial examples and how prevention methods such as early stopping can be generalized.
Speculative Future Developments:
- Advanced optimization techniques might emerge based on the insights from this work, focusing on refining adversarial training without substantially increasing computational overhead.
- This line of work could intersect further with fields that prize computational efficiency, such as federated learning or edge AI, where the reduced training times would be particularly beneficial.
Conclusion
The paper by Wong, Rice, and Kolter sheds new light on adversarial training methodologies, challenging the need for computationally expensive methods by demonstrating that FGSM, when enhanced with random initialization and contemporary training techniques, suffices to train robust deep networks efficiently. By achieving substantial reductions in training times, this work opens new avenues for scalable and practical robust learning in deep neural networks.
This summary condenses the paper's core contributions and discusses their broader implications, providing a valuable perspective for researchers engaged in the field of adversarial machine learning.