Consistent information criteria for regularized regression and loss-based learning problems (2404.17181v1)

Published 26 Apr 2024 in stat.ME

Abstract: Many problems in statistics and machine learning can be formulated as model selection problems, where the goal is to choose an optimal parsimonious model among a set of candidate models. It is typical to conduct model selection by penalizing the objective function via information criteria (IC), as in the pioneering work of Akaike and Schwarz. Recent work has proposed a generalized IC framework for the consistent estimation of general loss-based learning problems. In this work, we propose a consistent estimation method for Generalized Linear Model (GLM) regressions that builds on these recent IC developments, and we extend the generalized IC framework to model selection problems in which the model set consists of a potentially uncountable set of models. In addition to theoretical expositions, we introduce a computational procedure for implementing our methods in the finite-sample setting, which we demonstrate via an extensive simulation study.
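For orientation, the classical criteria of Akaike and Schwarz mentioned in the abstract penalize the maximized log-likelihood with a complexity term. In standard notation (our gloss, not taken from this paper), with $\hat{\ell}_n(m)$ the maximized log-likelihood of candidate model $m$ fitted to $n$ observations and $\dim(m)$ its number of free parameters,

    \mathrm{AIC}(m) = -2\,\hat{\ell}_n(m) + 2\dim(m), \qquad
    \mathrm{BIC}(m) = -2\,\hat{\ell}_n(m) + \dim(m)\log n,

and the selected model is the minimizer of the criterion over the candidate set.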
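A model set that is "potentially uncountable" arises naturally in regularized regression, where every value of the penalty parameter indexes a candidate model. The sketch below illustrates that idea only: it is a standard BIC-over-a-lasso-path recipe using scikit-learn, not the paper's proposed criterion or procedure, and the degrees-of-freedom choice (the nonzero-coefficient count, cf. reference 36) and all names are illustrative assumptions.

    # Illustrative sketch: BIC-style selection of a lasso penalty over a grid.
    # Each penalty value lambda indexes one candidate model.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    n, p = 200, 20
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:3] = [2.0, -1.5, 1.0]          # sparse ground truth (synthetic data)
    y = X @ beta + rng.standard_normal(n)

    def gaussian_bic(lam):
        # Gaussian BIC at penalty lam: n*log(RSS/n) + df*log(n),
        # with df taken as the number of nonzero fitted coefficients.
        fit = Lasso(alpha=lam, max_iter=10_000).fit(X, y)
        rss = float(np.sum((y - fit.predict(X)) ** 2))
        df = int(np.count_nonzero(fit.coef_))
        return n * np.log(rss / n) + df * np.log(n)

    lambdas = np.logspace(-3, 0, 50)     # finite grid over the penalty path
    best_lam = min(lambdas, key=gaussian_bic)
    print(f"BIC-selected lambda: {best_lam:.4f}")

Swapping in a different IC changes only the complexity term; the search over a grid approximating a continuum of candidate models stays the same.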

References (36)
  1. H. Akaike. A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6):716–723, 1974.
  2. Charalambos D. Aliprantis and Kim C. Border. Infinite Dimensional Analysis: A Hitchhiker's Guide. Springer, Berlin, 3rd edition, 2006.
  3. Takeshi Amemiya. Advanced Econometrics. Harvard University Press, Cambridge, 1985.
  4. Jean-Patrick Baudry. Estimation and model selection for model-based clustering with the conditional classification likelihood. Electronic Journal of Statistics, 9:1041–1077, 2015.
  5. J. Frédéric Bonnans and Alexander Shapiro. Perturbation Analysis of Optimization Problems. Springer, New York, 2000.
  6. Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, 2004.
  7. Peter Bühlmann and Sara van de Geer. Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer Series in Statistics. Springer, Berlin, 2011.
  8. Gerda Claeskens and Nils Lid Hjort. Model Selection and Model Averaging. Cambridge University Press, Cambridge, 2008.
  9. James Davidson. Stochastic Limit Theory: An Introduction for Econometricians. Advanced Texts in Econometrics. Oxford University Press, Oxford, 2nd edition, 2021.
  10. Jianqing Fan and Runze Li. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456):1348–1360, 2001.
  11. Xin Gao and Peter X.-K. Song. Composite likelihood Bayesian information criteria for model selection in high-dimensional data. Journal of the American Statistical Association, 105(492):1531–1540, 2010.
  12. Trevor Hastie, Robert Tibshirani, and Martin Wainwright. Statistical Learning with Sparsity: The Lasso and Generalizations. Number 143 in Monographs on Statistics and Applied Probability. CRC Press, Boca Raton, 2015.
  13. Arthur E. Hoerl and Robert W. Kennard. Ridge regression: biased estimation for nonorthogonal problems. Technometrics, 12(1):55–67, 1970.
  14. Francis K. C. Hui. On the use of a penalized quasilikelihood information criterion for generalized linear mixed models. Biometrika, 108(2):353–365, 2021.
  15. Order selection in finite mixture models: complete or observed likelihood information criteria? Biometrika, 102(3):724–730, 2015.
  16. Jinzhu Jia and Bin Yu. On model selection consistency of the elastic net when p ≫ n. Statistica Sinica, 20(2):595–611, 2010.
  17. Marius Kloft, Ulf Brefeld, Sören Sonnenburg, and Alexander Zien. ℓp-norm multiple kernel learning. Journal of Machine Learning Research, 12(26):953–997, 2011.
  18. Brian G. Leroux. Consistent estimation of a mixing distribution. The Annals of Statistics, 20(3):1350–1360, 1992.
  19. Pascal Massart and Caroline Meynet. The Lasso as an ℓ1-ball model selection procedure. Electronic Journal of Statistics, 5:669–687, 2011.
  20. Allan D. R. McQuarrie and Chih-Ling Tsai. Regression and Time Series Model Selection. World Scientific, Singapore, 1998.
  21. Chi Tim Ng and Harry Joe. Model comparison with composite likelihood information criteria. Bernoulli, 20(4):1738–1764, 2014.
  22. Hien Duy Nguyen. PanIC: consistent information criteria for general model selection problems. arXiv preprint, 2023.
  23. Luca Oneto, Sandro Ridella, and Davide Anguita. Tikhonov, Ivanov and Morozov regularization for support vector machine learning. Machine Learning, 103(1):103–136, 2016.
  24. Ralph Tyrrell Rockafellar. Convex Analysis. Princeton Landmarks in Mathematics and Physics. Princeton University Press, Princeton, NJ, 1997.
  25. Gideon Schwarz. Estimating the dimension of a model. The Annals of Statistics, 6(2):461–464, 1978.
  26. Alexander Shapiro, Darinka Dentcheva, and Andrzej Ruszczyński. Lectures on Stochastic Programming: Modeling and Theory. SIAM, Philadelphia, PA, 3rd edition, 2021.
  27. Chor-Yiu Sin and Halbert White. Information criteria for selecting possibly misspecified parametric models. Journal of Econometrics, 71(1-2):207–225, 1996.
  28. M. Stone. Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society, Series B (Methodological), 36(2):111–147, 1974.
  29. Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B (Methodological), 58(1):267–288, 1996.
  30. Vladimir N. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 2nd edition, 2000.
  31. Vladimir N. Vapnik. Statistical Learning Theory. Adaptive and Learning Systems for Signal Processing, Communications, and Control. Wiley, New York, 1998.
  32. Cristiano Varin and Paolo Vidoni. A note on composite likelihood inference and model selection. Biometrika, 92(3):519–528, 2005.
  33. Ming Yuan and Yi Lin. On the non-negative garrotte estimator. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 69(2):143–161, 2007.
  34. Peng Zhao and Bin Yu. On model selection consistency of Lasso. Journal of Machine Learning Research, 7(90):2541–2563, 2006.
  35. Hui Zou and Trevor Hastie. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 67(2):301–320, 2005.
  36. Hui Zou, Trevor Hastie, and Robert Tibshirani. On the "degrees of freedom" of the lasso. The Annals of Statistics, 35(5):2173–2192, 2007.
