
A sparse PAC-Bayesian approach for high-dimensional quantile prediction (2409.01687v1)

Published 3 Sep 2024 in stat.ML, cs.LG, math.ST, and stat.TH

Abstract: Quantile regression, a robust method for estimating conditional quantiles, has advanced significantly in fields such as econometrics, statistics, and machine learning. In high-dimensional settings, where the number of covariates exceeds the sample size, penalized methods such as the lasso have been developed to address sparsity challenges. Bayesian methods, initially connected to quantile regression via the asymmetric Laplace likelihood, have also evolved, though issues with posterior variance have motivated new approaches, including pseudo- and score-likelihoods. This paper presents a novel probabilistic machine learning approach for high-dimensional quantile prediction. It uses a pseudo-Bayesian framework with a scaled Student-t prior and Langevin Monte Carlo for efficient computation. The method enjoys strong theoretical guarantees: PAC-Bayes bounds yield non-asymptotic oracle inequalities showing minimax-optimal prediction error and adaptation to the unknown sparsity level. Its effectiveness is validated through simulations and real-world data, where it performs competitively against established frequentist and Bayesian techniques.
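To make the recipe in the abstract concrete, below is a minimal, illustrative sketch (not the authors' code) of the core construction: a Gibbs pseudo-posterior proportional to exp(-lambda * sum_i rho_tau(y_i - x_i' theta)) times an independent scaled Student-t prior on each coordinate, sampled with the unadjusted Langevin algorithm. The check loss rho_tau is not differentiable at zero, so the sketch uses a subgradient; the names lam, scale, df, step and all default values are illustrative assumptions, not the paper's tuned choices.

```python
import numpy as np

def pinball_loss(u, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def grad_log_target(theta, X, y, tau, lam, scale, df=3.0):
    """(Sub)gradient of the log pseudo-posterior
        log rho(theta) = -lam * sum_i rho_tau(y_i - x_i' theta) + log prior(theta),
    with an independent scaled Student-t prior (df degrees of freedom,
    given scale) on each coordinate. All hyperparameter names are illustrative."""
    u = y - X @ theta
    psi = tau - (u < 0)                     # d rho_tau(u) / du away from u = 0
    grad_loss = -X.T @ psi                  # subgradient of the empirical check loss
    grad_prior = -(df + 1) * theta / (df * scale**2 + theta**2)
    return -lam * grad_loss + grad_prior

def langevin_quantile(X, y, tau=0.5, lam=1.0, scale=0.1,
                      step=1e-4, n_iter=5000, seed=0):
    """Unadjusted Langevin Monte Carlo for the quantile pseudo-posterior;
    returns the posterior-mean estimate from the second half of the chain."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    theta = np.zeros(p)
    samples = []
    for _ in range(n_iter):
        g = grad_log_target(theta, X, y, tau, lam, scale)
        theta = theta + step * g + np.sqrt(2 * step) * rng.standard_normal(p)
        samples.append(theta.copy())
    return np.mean(samples[n_iter // 2:], axis=0)

# Toy usage: a sparse signal with n < p.
rng = np.random.default_rng(1)
n, p = 50, 200
X = rng.standard_normal((n, p))
theta_true = np.zeros(p)
theta_true[:3] = [2.0, -1.5, 1.0]
y = X @ theta_true + rng.standard_normal(n)
theta_hat = langevin_quantile(X, y, tau=0.5)
print("in-sample check loss:", pinball_loss(y - X @ theta_hat, 0.5).mean())
```

The scaled Student-t prior puts substantial mass near zero while keeping heavy tails, which is what allows the pseudo-posterior to adapt to the unknown sparsity level; in practice the inverse temperature lam and the Langevin step size would be set following the paper's theoretical prescriptions or by cross-validation rather than the placeholder defaults above.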
