On robust overfitting: adversarial training induced distribution matters (2311.16526v2)
Abstract: Adversarial training may be regarded as standard training with a modified loss function. Yet its generalization error appears much larger than that of standard training under the standard loss. This phenomenon, known as robust overfitting, has attracted significant research attention and remains largely a mystery. In this paper, we first show empirically that robust overfitting correlates with the increasing generalization difficulty of the perturbation-induced distributions along the trajectory of adversarial training (specifically, PGD-based adversarial training). We then provide a novel upper bound on the generalization error with respect to the perturbation-induced distributions, in which a property of the perturbation operator, referred to as "local dispersion", plays an important role. Experimental results are presented to validate the usefulness of the bound, and various additional insights are provided.
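For readers unfamiliar with the procedure the abstract refers to, below is a minimal PyTorch sketch of PGD-based adversarial training: the inner loop approximately maximizes the loss over an l-infinity ball around each input, and the outer step trains on the resulting perturbed batch. The hyperparameters (eps = 8/255, alpha = 2/255, 10 steps) and all function names are illustrative assumptions, not the paper's experimental configuration.

```python
# A hedged sketch of PGD-based adversarial training (in the style of
# Madry et al.), the setting in which the paper studies robust
# overfitting. Hyperparameters below are common defaults, assumed here.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Inner maximization: find an l_inf perturbation within an
    eps-ball around x that approximately maximizes the loss."""
    delta = torch.empty_like(x).uniform_(-eps, eps)  # random start
    delta = delta.detach().requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        # Ascend along the gradient sign, then project back onto the
        # eps-ball and the valid pixel range [0, 1].
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = (x + delta).clamp(0, 1) - x
        delta = delta.detach().requires_grad_(True)
    return (x + delta).detach()

def adversarial_training_step(model, optimizer, x, y):
    """Outer minimization: a standard training step on the perturbed
    batch. The perturbed inputs are samples from the perturbation-
    induced distribution whose generalization difficulty the paper
    tracks along the training trajectory."""
    model.eval()               # freeze batch-norm stats while attacking
    x_adv = pgd_attack(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The point relevant to the abstract is that each step fits the model to x_adv, i.e., to a sample from a perturbation-induced distribution that itself shifts as the model's parameters change during training.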