The Surprising Harmfulness of Benign Overfitting for Adversarial Robustness (2401.12236v2)
Abstract: Recent empirical and theoretical studies have established the generalization capabilities of large machine learning models that are trained to (approximately or exactly) fit noisy data. In this work, we prove a surprising result: even if the ground truth itself is robust to adversarial examples, and the benignly overfitted model is benign in terms of the "standard" out-of-sample risk objective, this benign overfitting process can be harmful when out-of-sample data are subject to adversarial manipulation. More specifically, our main results contain two parts: (i) the min-norm estimator in an overparameterized linear model always leads to adversarial vulnerability in the "benign overfitting" setting; (ii) we verify an asymptotic trade-off between the standard risk and the "adversarial" risk of every ridge regression estimator, implying that under suitable conditions these two quantities cannot both be made small by any single choice of the ridge regularization parameter. Furthermore, in the lazy training regime, we demonstrate parallel results for the two-layer neural tangent kernel (NTK) model, which align with empirical observations in deep neural networks. Our findings provide theoretical insight into a puzzling phenomenon observed in practice: the true target function (e.g., a human) is robust against adversarial attack, while benignly overfitted neural networks are not.
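To make the notion of "adversarial" risk for a linear model concrete, note that for a predictor f(x) = xᵀθ under an ℓ2 perturbation of radius ε, the worst-case squared loss has an exact closed form: sup over ‖δ‖₂ ≤ ε of ((x+δ)ᵀθ − y)² equals (|xᵀθ − y| + ε‖θ‖)², attained by a perturbation aligned with θ. The sketch below (an illustrative construction, not code from the paper; all variable values are arbitrary) verifies this identity numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
d, eps = 20, 0.1
theta = rng.normal(size=d)      # linear model parameters
x = rng.normal(size=d)          # a test input
y = x @ theta + 0.3             # label with a fixed residual of -0.3

# Closed form: the attack can shift the prediction by at most eps*||theta||,
# so the worst-case squared loss is (|residual| + eps*||theta||)^2.
resid = x @ theta - y
closed_form = (abs(resid) + eps * np.linalg.norm(theta)) ** 2

# Explicit maximizer: perturb along +/- theta, signed to enlarge the residual.
delta = eps * np.sign(resid) * theta / np.linalg.norm(theta)
attacked = ((x + delta) @ theta - y) ** 2

print(np.isclose(closed_form, attacked))  # prints True
```

The extra ε‖θ‖ term is why the estimator's norm governs robustness: two estimators with the same standard risk can have very different adversarial risk if their norms differ, which is the lever behind the trade-off results stated above.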
- Yifan Hao
- Tong Zhang