Regularization properties of adversarially-trained linear regression (2310.10807v1)

Published 16 Oct 2023 in stat.ML, cs.CR, cs.LG, and math.OC

Abstract: State-of-the-art machine learning models can be vulnerable to very small, adversarially constructed input perturbations. Adversarial training is an effective approach to defend against such attacks. Formulated as a min-max problem, it searches for the best solution when the training data are corrupted by worst-case attacks. Linear models are among the simplest models in which such vulnerabilities can be observed, and they are the focus of our study. In this case, adversarial training leads to a convex optimization problem which can be formulated as the minimization of a finite sum. We provide a comparative analysis between the solution of adversarial training in linear regression and other regularization methods. Our main findings are that: (A) Adversarial training yields the minimum-norm interpolating solution in the overparameterized regime (more parameters than data points), as long as the maximum disturbance radius is smaller than a threshold; conversely, the minimum-norm interpolator is the solution to adversarial training for an appropriate choice of radius. (B) Adversarial training can be equivalent to parameter-shrinking methods (ridge regression and Lasso). This happens in the underparameterized regime, for an appropriate choice of adversarial radius and zero-mean, symmetrically distributed covariates. (C) For $\ell_\infty$-adversarial training, as in the square-root Lasso, the choice of adversarial radius that yields optimal bounds does not depend on the additive noise variance. We confirm our theoretical findings with numerical examples.
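Findings (A) and (B) are easy to probe numerically. Below is a minimal sketch (illustrative, not the authors' code) that relies on the standard worst-case identity for linear models: for $\ell_\infty$ disturbances of radius $\delta$, $\max_{\|\Delta x\|_\infty \le \delta} \left(y - (x + \Delta x)^\top w\right)^2 = \left(|y - x^\top w| + \delta \|w\|_1\right)^2$, since $\ell_1$ is the dual norm of $\ell_\infty$. The CVXPY formulation, the radii, and the problem sizes are assumptions made for illustration, not values from the paper.

```python
# Minimal numerical sketch (illustrative, not the authors' code) of
# l_inf-adversarial training for linear regression. For linear models the
# inner maximization has a closed form, so the min-max problem becomes the
# convex finite-sum minimization of (|y_i - x_i^T w| + delta * ||w||_1)^2.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)

def adv_train_linf(X, y, delta):
    """Adversarial training with l_inf disturbance radius delta,
    via its equivalent convex reformulation."""
    n, p = X.shape
    w = cp.Variable(p)
    # Worst-case absolute residuals: |y_i - x_i^T w| + delta * ||w||_1.
    worst = cp.abs(y - X @ w) + delta * cp.norm1(w)
    cp.Problem(cp.Minimize(cp.sum(cp.square(worst)) / n)).solve()
    return w.value

# (B) Underparameterized regime: the solution shrinks, Lasso-like.
n, p = 200, 10
X = rng.standard_normal((n, p))                # zero-mean symmetric covariates
y = X @ rng.standard_normal(p) + 0.1 * rng.standard_normal(n)
w_ls = np.linalg.lstsq(X, y, rcond=None)[0]
w_adv = adv_train_linf(X, y, delta=0.05)
print("l1 norms (least squares vs adversarial):",
      np.abs(w_ls).sum(), np.abs(w_adv).sum())  # adversarial one is smaller

# (A) Overparameterized regime: for small enough delta the solution interpolates.
n, p = 20, 100
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p)
w_adv = adv_train_linf(X, y, delta=1e-4)
print("max training residual:", np.abs(y - X @ w_adv).max())  # ~0 below threshold
```

Per finding (A), the residual in the second experiment should be numerically zero whenever delta sits below the data-dependent threshold; raising delta past it trades interpolation for shrinkage.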

Authors (4)
  1. Antônio H. Ribeiro (21 papers)
  2. Dave Zachariah (52 papers)
  3. Francis Bach (249 papers)
  4. Thomas B. Schön (132 papers)
Citations (7)