
Robust high-dimensional Gaussian and bootstrap approximations for trimmed sample means (2410.22085v1)

Published 29 Oct 2024 in math.ST, math.PR, and stat.TH

Abstract: Most of the modern literature on robust mean estimation focuses on designing estimators that attain optimal sub-Gaussian concentration bounds under minimal moment assumptions, sometimes also allowing for contamination. This work looks at robustness in terms of Gaussian and bootstrap approximations, mainly in the regime where the dimension is exponential in the sample size. We show that, under mild moment assumptions and contamination, trimmed sample means attain Gaussian and bootstrap approximation bounds similar to those attained by the empirical mean under light tails. We apply our results to study the Gaussian approximation of VC-subgraph families and also to the problem of vector mean estimation under general norms, improving the bounds currently available in the literature.
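To make the objects in the abstract concrete, the sketch below implements a coordinatewise trimmed sample mean and a Gaussian multiplier bootstrap for the maximum coordinate deviation, the kind of high-dimensional statistic for which Gaussian and bootstrap approximations are studied. This is a minimal illustration, not the paper's construction: the trimming rule (drop the k smallest and k largest observations per coordinate), the choice of k, and the plug-in residuals are assumptions made here for demonstration only.

```python
# Minimal sketch (not the paper's exact construction): a coordinatewise trimmed
# sample mean and a Gaussian multiplier bootstrap for the max statistic.
import numpy as np

def trimmed_mean(X, k):
    """Coordinatewise trimmed mean: discard the k smallest and k largest
    observations in each coordinate, then average the rest. X has shape (n, d)."""
    n, d = X.shape
    Xs = np.sort(X, axis=0)          # sort each coordinate separately
    return Xs[k:n - k].mean(axis=0)  # average the middle n - 2k order statistics

def multiplier_bootstrap_max(X, k, n_boot=1000, seed=None):
    """Gaussian multiplier bootstrap for the max-coordinate deviation around the
    trimmed mean, using crude plug-in residuals (an assumption for illustration)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    center = trimmed_mean(X, k)
    resid = X - center                     # plug-in residuals
    stats = np.empty(n_boot)
    for b in range(n_boot):
        g = rng.standard_normal(n)         # i.i.d. N(0, 1) multipliers
        stats[b] = np.max(np.abs(g @ resid) / np.sqrt(n))
    return center, stats

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 200, 1000                       # dimension much larger than sample size
    X = rng.standard_t(df=3, size=(n, d))  # heavy-tailed data
    X[:5] += 50.0                          # a few contaminated rows
    center, boot = multiplier_bootstrap_max(X, k=10, seed=1)
    print("max |trimmed mean coordinate|:", np.abs(center).max())
    print("bootstrap 95% quantile of max statistic:", np.quantile(boot, 0.95))
```

In this toy setup the trimmed mean remains close to zero despite heavy tails and a handful of contaminated rows, and the bootstrap quantile gives a data-driven critical value for simultaneous inference over all d coordinates.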

