Dimension-free uniform concentration bound for logistic regression (2405.18055v5)

Published 28 May 2024 in math.ST, stat.ML, and stat.TH

Abstract: We provide a novel dimension-free uniform concentration bound for the empirical risk function of constrained logistic regression. Our bound yields a milder sufficient condition for a uniform law of large numbers than conditions derived by the Rademacher complexity argument and McDiarmid's inequality. The derivation is based on the PAC-Bayes approach with second-order expansion and Rademacher-complexity-based bounds for the residual term of the expansion.
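For readers unfamiliar with the setting, the objects named in the abstract can be written out as follows. This is a minimal sketch of the standard constrained logistic regression setup; the specific constraint set below (a Euclidean ball of radius B) is an assumption, since the abstract does not spell it out. Given i.i.d. pairs (x_i, y_i) with labels y_i in {-1, +1}, the empirical risk is the averaged logistic loss over the constraint set,

\[
\widehat{R}_n(\theta) \;=\; \frac{1}{n}\sum_{i=1}^{n}\log\!\bigl(1+\exp(-y_i\,\theta^\top x_i)\bigr),
\qquad \theta \in \Theta = \{\theta : \|\theta\|_2 \le B\},
\]

and a uniform concentration bound controls the worst-case deviation from the population risk,

\[
\sup_{\theta \in \Theta}\,\bigl|\widehat{R}_n(\theta) - R(\theta)\bigr|,
\qquad R(\theta) = \mathbb{E}\bigl[\log\bigl(1+\exp(-y\,\theta^\top x)\bigr)\bigr].
\]

Here "dimension-free" means the resulting bound does not depend on the ambient dimension of x, in contrast to guarantees whose constants grow with it.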
