Instance-dependent uniform tail bounds for empirical processes (2209.10053v6)
Abstract: We formulate a uniform tail bound for empirical processes indexed by a class of functions, in terms of the individual deviations of the functions rather than the worst-case deviation in the considered class. The tail bound is established by introducing an initial ``deflation'' step to the standard generic chaining argument. The resulting tail bound is the sum of the complexity of the ``deflated function class'' in terms of a generalization of Talagrand's $\gamma$ functional, and the deviation of the function instance, both of which are formulated based on the natural seminorm induced by the corresponding Cram\'{e}r functions. Leveraging another less demanding natural seminorm, we also show similar bounds, though with implicit dependence on the sample size, in the more general case where finite exponential moments cannot be assumed. We also provide approximations of the tail bounds in terms of the more prevalent Orlicz norms or their ``incomplete'' versions under suitable moment conditions.
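For orientation, the classical objects that the abstract builds on can be written as below. This is a sketch of the standard textbook forms only, not the generalized functional or the seminorm constructed in the paper. The symbols $T$, $d$, $T_n$, $\alpha$, $X$, and $\Lambda^*$ are notation assumed here for illustration.

```latex
% Talagrand's \gamma_\alpha functional of a metric space (T, d), classical form.
% The infimum runs over admissible sequences (T_n)_{n \ge 0} of subsets of T,
% i.e. |T_0| = 1 and |T_n| \le 2^{2^n} for n \ge 1.
\[
  \gamma_\alpha(T, d)
  \;=\; \inf_{(T_n)_{n \ge 0}} \; \sup_{t \in T} \; \sum_{n \ge 0} 2^{n/\alpha}\, d(t, T_n),
  \qquad d(t, T_n) \;=\; \inf_{s \in T_n} d(t, s).
\]

% Cram\'{e}r function (Legendre transform of the cumulant generating function)
% of a real random variable X, assumed here to have finite exponential moments.
\[
  \Lambda^*(x) \;=\; \sup_{\lambda \in \mathbb{R}}
  \bigl( \lambda x - \log \mathbb{E}\, e^{\lambda X} \bigr).
\]
```

In the classical generic chaining bound the complexity term is controlled by $\gamma_\alpha$ of the whole index class. The abstract describes replacing this with the complexity of a deflated class plus a deviation term for the individual function.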