Label Noise in Adversarial Training: A Novel Perspective to Study Robust Overfitting (2110.03135v4)
Abstract: We show that label noise exists in adversarial training. Such label noise arises from the mismatch between the true label distribution of adversarial examples and the labels inherited from the corresponding clean examples: the adversarial perturbation distorts the true label distribution, but the common practice of reusing clean labels ignores this distortion. Recognizing this label noise sheds light on the prevalence of robust overfitting in adversarial training and explains its intriguing dependence on perturbation radius and data quality. Our label noise perspective also aligns well with our observation of epoch-wise double descent in adversarial training. Guided by these analyses, we propose a method that automatically calibrates labels to address label noise and robust overfitting. Our method achieves consistent performance improvements across various models and datasets without introducing new hyper-parameters or additional tuning.
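Since the abstract only summarizes the approach, the snippet below is a minimal PyTorch sketch of the general idea rather than the paper's exact algorithm: it pairs standard L-infinity PGD adversarial training with a hypothetical label-calibration step that softens the inherited one-hot label using the model's own predictive distribution. The helper names (`pgd_attack`, `calibrated_targets`, `adv_train_step`) and the interpolation weight `beta` are illustrative assumptions, not quantities from the paper.

```python
# Hypothetical sketch: adversarial training with soft, calibrated labels.
# All helper names and the fixed interpolation weight `beta` are assumptions
# made for illustration; they are not the paper's exact calibration rule.

import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard L_inf PGD: maximize cross-entropy w.r.t. the inherited label y."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project back into the eps-ball around x and the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()


def calibrated_targets(model, x, y, num_classes, beta=0.7):
    """Hypothetical calibration: blend the inherited one-hot label with the
    model's own predictive distribution on the clean input, acknowledging that
    the adversarial perturbation may shift the true label distribution."""
    with torch.no_grad():
        pred = F.softmax(model(x), dim=1)
    one_hot = F.one_hot(y, num_classes).float()
    return beta * one_hot + (1 - beta) * pred


def adv_train_step(model, optimizer, x, y, num_classes=10):
    model.eval()  # freeze batch-norm statistics while crafting the attack
    x_adv = pgd_attack(model, x, y)
    targets = calibrated_targets(model, x, y, num_classes)

    model.train()
    optimizer.zero_grad()
    log_probs = F.log_softmax(model(x_adv), dim=1)
    # KL divergence to the soft targets; equivalent to soft-label
    # cross-entropy up to a constant that does not affect the gradient.
    loss = F.kl_div(log_probs, targets, reduction="batchmean")
    loss.backward()
    optimizer.step()
    return loss.item()
```

The fixed `beta` here is purely illustrative; any scheme that replaces the hard inherited label with a better estimate of the adversarial example's true label distribution would fit the description in the abstract, and the paper's actual calibration is derived from its label-noise analysis.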