Catastrophic Overfitting: A Potential Blessing in Disguise (2402.18211v1)

Published 28 Feb 2024 in cs.LG and cs.CR

Abstract: Fast Adversarial Training (FAT) has gained increasing attention within the research community owing to its efficacy in improving adversarial robustness. Particularly noteworthy is the challenge posed by catastrophic overfitting (CO) in this field. Although existing FAT approaches have made strides in mitigating CO, the gain in adversarial robustness comes with a non-negligible decline in classification accuracy on clean samples. To tackle this issue, we first use the feature activation differences between clean and adversarial examples to analyze the underlying causes of CO. Intriguingly, our findings reveal that CO can be attributed to the feature coverage induced by a few specific pathways. By intentionally manipulating feature activation differences in these pathways with well-designed regularization terms, we can effectively either mitigate or induce CO, providing further evidence for this observation. Notably, models trained stably with these terms outperform prior FAT work. On this basis, we harness CO to achieve "attack obfuscation", aiming to bolster model performance. Consequently, models suffering from CO can attain optimal classification accuracy on both clean and adversarial data when random noise is added to inputs during evaluation. We also validate their robustness against transferred adversarial examples and the necessity of inducing CO to improve robustness. Hence, CO may not be a problem that has to be solved.
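
The abstract describes analyzing CO through feature-activation differences between clean and adversarial examples and steering CO with regularization terms on those differences. As a rough illustration only, not the paper's actual method, the sketch below shows single-step FGSM adversarial training (the FAT setting in which CO arises) with a hypothetical penalty on the activation gap; `feature_extractor`, `eps`, and `reg_weight` are assumed names and parameters, not taken from the paper.

```python
# A minimal sketch, not the paper's implementation: single-step FGSM
# adversarial training (the FAT setting where CO arises) with a hypothetical
# regularizer on feature-activation differences between clean and adversarial
# inputs. `feature_extractor`, `eps`, and `reg_weight` are assumed names.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps):
    """Single-step FGSM: perturb x in the direction of the loss gradient's sign."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + eps * grad.sign()).clamp(0.0, 1.0).detach()

def fat_loss(model, feature_extractor, x, y, eps, reg_weight):
    """Adversarial loss plus a penalty on the clean-vs-adversarial activation gap."""
    x_adv = fgsm_perturb(model, x, y, eps)
    adv_loss = F.cross_entropy(model(x_adv), y)
    # Shrinking this gap (or, with the sign flipped, widening it) is the kind
    # of intervention the abstract describes for mitigating or inducing CO.
    gap = (feature_extractor(x_adv) - feature_extractor(x)).abs().mean()
    return adv_loss + reg_weight * gap
```

An L1 gap over all activations is used here purely for concreteness; the paper's regularization terms target a few specific pathways, which this sketch does not model.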

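The "attack obfuscation" evaluation the abstract mentions adds random noise to inputs at test time so that a CO-affected model recovers accuracy on both clean and adversarial data. A minimal sketch, assuming Gaussian noise with an illustrative `noise_std` (the abstract specifies neither the noise distribution nor its magnitude):

```python
# Sketch of the evaluation-time "attack obfuscation" trick from the abstract:
# classify inputs after adding random noise. Gaussian noise and `noise_std`
# are illustrative assumptions; the abstract does not specify either.
import torch

@torch.no_grad()
def predict_with_noise(model, x, noise_std=0.1):
    """Add input noise at test time before taking the argmax prediction."""
    return model(x + noise_std * torch.randn_like(x)).argmax(dim=1)
```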