Regularization properties of adversarially-trained linear regression (2310.10807v1)
Abstract: State-of-the-art machine learning models can be vulnerable to very small, adversarially constructed input perturbations. Adversarial training is an effective approach to defend against such attacks. Formulated as a min-max problem, it searches for the best solution when the training data are corrupted by worst-case attacks. Linear models are among the simplest models in which these vulnerabilities can be observed, and they are the focus of our study. In this case, adversarial training leads to a convex optimization problem, which can be formulated as the minimization of a finite sum. We provide a comparative analysis between the solution of adversarial training in linear regression and other regularization methods. Our main findings are that: (A) Adversarial training yields the minimum-norm interpolating solution in the overparameterized regime (more parameters than data points), as long as the maximum disturbance radius is smaller than a threshold; conversely, the minimum-norm interpolator is the solution to adversarial training for a given radius. (B) Adversarial training can be equivalent to parameter-shrinking methods (ridge regression and Lasso); this happens in the underparameterized regime, for an appropriate choice of adversarial radius and zero-mean, symmetrically distributed covariates. (C) For $\ell_\infty$-adversarial training -- as in square-root Lasso -- the choice of adversarial radius that yields optimal bounds does not depend on the additive noise variance. We confirm our theoretical findings with numerical examples.
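To make the min-max-to-finite-sum reduction in the abstract concrete, here is a minimal sketch (not the authors' code; the synthetic data, the radius `delta`, and the use of CVXPY are illustrative assumptions). For $\ell_\infty$-bounded attacks of radius $\delta$, the inner maximization has the closed form $\max_{\|\Delta x\|_\infty \le \delta} (y - (x+\Delta x)^\top \beta)^2 = (|y - x^\top \beta| + \delta \|\beta\|_1)^2$, so adversarial training becomes an ordinary convex minimization:

```python
# Minimal sketch of l_inf-adversarial training for linear regression,
# using the closed-form inner maximization
#   max_{||dx||_inf <= delta} (y - (x + dx)^T beta)^2
#     = (|y - x^T beta| + delta * ||beta||_1)^2.
# Synthetic data and delta are illustrative assumptions.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 10                          # underparameterized example (n > p)
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
y = X @ beta_true + 0.1 * rng.standard_normal(n)

delta = 0.05                           # adversarial radius (hypothetical value)
beta = cp.Variable(p)

# Worst-case absolute residual per data point; the dual norm of l_inf
# is l_1 (for l_2 attacks, replace cp.norm1 with cp.norm2).
worst_case_residual = cp.abs(y - X @ beta) + delta * cp.norm1(beta)
problem = cp.Problem(cp.Minimize(cp.sum(cp.square(worst_case_residual)) / n))
problem.solve()

print("adversarially trained coefficients:", beta.value)
```

Sending $\delta \to 0$ recovers ordinary least squares, while the $\delta \|\beta\|_1$ term inside the square is what produces the Lasso-like shrinkage behavior described in finding (B).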
Authors: Antônio H. Ribeiro, Dave Zachariah, Francis Bach, Thomas B. Schön