Stagewise Boosting Distributional Regression (2405.18288v1)

Published 28 May 2024 in stat.ME and stat.ML

Abstract: Forward stagewise regression is a simple algorithm that can be used to estimate regularized models. The updating rule adds a small constant to a regression coefficient in each iteration, such that the underlying optimization problem is solved slowly with small improvements. This is similar to gradient boosting, with the essential difference that in gradient boosting the step size is determined by the product of the gradient and a step length parameter. One often overlooked challenge in gradient boosting for distributional regression is a vanishingly small gradient, which practically halts the algorithm's progress. We show that gradient boosting in this case often results in suboptimal models, especially for complex problems, where certain distributional parameters are never updated due to the vanishing gradient. Therefore, we propose a stagewise boosting-type algorithm for distributional regression, combining stagewise regression ideas with gradient boosting. Additionally, we extend it with a novel regularization method, correlation filtering, to provide additional stability when the problem involves a large number of covariates. Furthermore, the algorithm includes best-subset selection for parameters and can be applied to big data problems by leveraging stochastic approximations of the updating steps. Besides the advantage of processing large datasets, the stochastic nature of the approximations can lead to better results, especially for complex distributions, by reducing the risk of being trapped in a local optimum. The performance of our proposed stagewise boosting distributional regression approach is investigated in an extensive simulation study and by estimating a full probabilistic model for lightning counts with data of more than 9.1 million observations and 672 covariates.
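The difference between the two updating rules described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it covers only generic forward stagewise regression for squared-error loss, without distributional parameters, correlation filtering, best-subset selection, or stochastic approximation. The function names, the default values of `epsilon` and `step_length`, and the toy data are all assumptions made for illustration.

```python
def gradient_boosting_step(neg_gradient, step_length=0.1):
    """Gradient-boosting-style step: gradient times a step length parameter."""
    return step_length * neg_gradient


def stagewise_step(neg_gradient, epsilon=0.01):
    """Stagewise step: a small constant whose sign follows the negative gradient."""
    if neg_gradient == 0:
        return 0.0
    return epsilon if neg_gradient > 0 else -epsilon


def forward_stagewise(X, y, epsilon=0.01, n_iter=200):
    """Forward stagewise least squares; X is a list of rows, y a list of responses."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)
    for _ in range(n_iter):
        # Inner product of each covariate with the current residuals; this is
        # proportional to the negative gradient of the squared-error loss.
        neg_grads = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        # Update only the coefficient with the largest absolute gradient.
        j = max(range(p), key=lambda k: abs(neg_grads[k]))
        step = stagewise_step(neg_grads[j], epsilon)
        if step == 0.0:
            break  # all gradients are exactly zero, nothing left to improve
        beta[j] += step
        for i in range(n):
            resid[i] -= step * X[i][j]
    return beta


# Toy data (assumed): y depends only on the first covariate, with coefficient 0.5.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [0.5, -0.5, 0.0, 0.0]
beta = forward_stagewise(X, y)

# With a vanishingly small gradient, the gradient-based step shrinks with it,
# while the stagewise step keeps its constant size.
tiny = 1e-8
gb_move = gradient_boosting_step(tiny)
sw_move = stagewise_step(tiny)
```

In this sketch the stagewise coefficient ends up within a couple of `epsilon`-sized steps of 0.5 and then oscillates around it, which is why stagewise procedures are usually paired with a stopping rule. The last lines show the contrast the abstract points to: when the gradient is close to zero, a step proportional to the gradient is close to zero as well, whereas the constant stagewise step is not.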

