
High-dimensional robust regression under heavy-tailed data: Asymptotics and Universality (2309.16476v2)

Published 28 Sep 2023 in math.ST, cond-mat.dis-nn, cs.LG, stat.ML, and stat.TH

Abstract: We investigate the high-dimensional properties of robust regression estimators in the presence of heavy-tailed contamination of both the covariates and response functions. In particular, we provide a sharp asymptotic characterisation of M-estimators trained on a family of elliptical covariate and noise data distributions including cases where second and higher moments do not exist. We show that, despite being consistent, the Huber loss with optimally tuned location parameter $\delta$ is suboptimal in the high-dimensional regime in the presence of heavy-tailed noise, highlighting the necessity of further regularisation to achieve optimal performance. This result also uncovers the existence of a transition in $\delta$ as a function of the sample complexity and contamination. Moreover, we derive the decay rates for the excess risk of ridge regression. We show that, while it is both optimal and universal for covariate distributions with finite second moment, its decay rate can be considerably faster when the covariates' second moment does not exist. Finally, we show that our formulas readily generalise to a richer family of models and data distributions, such as generalised linear estimation with arbitrary convex regularisation trained on mixture models.
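The abstract's central comparison, a Huber M-estimator against ridge regression when the noise is heavy-tailed, can be illustrated with a small simulation. This is a toy sketch, not the paper's asymptotic analysis: the dimensions, the Student-t noise, and the choices of the Huber parameter $\delta$ and ridge penalty $\lambda$ below are all arbitrary assumptions, and the Huber fit uses a standard iteratively reweighted least squares (IRLS) scheme rather than anything specific to the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: linear model with heavy-tailed (Student-t, df=2) noise,
# whose second moment does not exist -- the regime the abstract discusses.
n, d = 400, 100
theta_star = rng.standard_normal(d) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = X @ theta_star + rng.standard_t(df=2, size=n)

def ridge(X, y, lam):
    """Closed-form ridge regression estimator."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def huber_ridge(X, y, delta, lam, iters=50):
    """Huber M-estimator with ridge penalty, fit by IRLS.

    Weight w_i = psi(r_i)/r_i = min(1, delta/|r_i|) downweights
    observations whose residual exceeds delta.
    """
    theta = ridge(X, y, lam)
    for _ in range(iters):
        r = y - X @ theta
        w = np.minimum(1.0, delta / np.maximum(np.abs(r), 1e-12))
        Xw = X * w[:, None]
        theta = np.linalg.solve(X.T @ Xw + lam * np.eye(X.shape[1]), Xw.T @ y)
    return theta

err_ridge = np.sum((ridge(X, y, lam=1.0) - theta_star) ** 2)
err_huber = np.sum((huber_ridge(X, y, delta=1.0, lam=1.0) - theta_star) ** 2)
print(f"ridge estimation error: {err_ridge:.4f}")
print(f"huber estimation error: {err_huber:.4f}")
```

On typical draws with noise this heavy-tailed, the Huber fit attains a markedly smaller estimation error than plain ridge; the paper's point is the finer one that even the optimally tuned $\delta$ is not enough in high dimensions without further regularisation.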
