On the Laplace Approximation as Model Selection Criterion for Gaussian Processes (2403.09215v1)

Published 14 Mar 2024 in cs.LG, cs.AI, and stat.ML

Abstract: Model selection aims to find the best model in terms of accuracy, interpretability, or simplicity, preferably all at once. In this work, we focus on evaluating the model performance of Gaussian process models, i.e., finding a metric that provides the best trade-off between all those criteria. While previous work considers metrics like the likelihood, AIC, or dynamic nested sampling, these either lack performance or have significant runtime issues, which severely limits their applicability. We address these challenges by introducing multiple metrics based on the Laplace approximation, where we overcome a severe inconsistency occurring during naive application of the Laplace approximation. Experiments show that our metrics are comparable in quality to the gold standard, dynamic nested sampling, without compromising computational speed. Our model selection criteria allow significantly faster, high-quality model selection of Gaussian process models.
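To make the general technique concrete, the sketch below shows the naive Laplace approximation to the log evidence of a Gaussian process regression model: optimize the kernel hyperparameters to a MAP estimate, then approximate the evidence as log p(y, θ̂) + (d/2) log 2π − (1/2) log |H|, with H the Hessian of the negative log posterior at θ̂. This is only an illustration of the baseline idea the paper builds on, not the paper's corrected metrics; the RBF kernel, the log-space standard-normal prior, and all helper names (rbf_kernel, neg_log_posterior, laplace_log_evidence) are assumptions made for this sketch.

```python
# Sketch (assumed setup, not the paper's method): naive Laplace approximation
# to the log evidence of a GP regression model with an RBF kernel.
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(x, lengthscale, variance):
    """Squared-exponential kernel matrix for 1-D inputs x."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def neg_log_posterior(log_theta, x, y):
    """Negative log posterior over log-hyperparameters
    (lengthscale, signal variance, noise variance),
    with an assumed standard-normal prior in log space."""
    lengthscale, sig_var, noise_var = np.exp(log_theta)
    K = rbf_kernel(x, lengthscale, sig_var) + noise_var * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    log_lik = (-0.5 * y @ alpha
               - np.sum(np.log(np.diag(L)))
               - 0.5 * len(x) * np.log(2 * np.pi))
    log_prior = -0.5 * np.sum(log_theta**2) - 1.5 * np.log(2 * np.pi)
    return -(log_lik + log_prior)

def numerical_hessian(f, theta, eps=1e-4):
    """Central-difference Hessian of a scalar function f at theta."""
    d = len(theta)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            ei, ej = np.eye(d)[i] * eps, np.eye(d)[j] * eps
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * eps**2)
    return H

def laplace_log_evidence(x, y):
    """Naive Laplace approximation to log p(y | model)."""
    res = minimize(neg_log_posterior, x0=np.zeros(3), args=(x, y),
                   method="L-BFGS-B")
    theta_map = res.x
    H = numerical_hessian(lambda t: neg_log_posterior(t, x, y), theta_map)
    # At a proper minimum H is positive definite, so the log-determinant is real.
    _, logdet = np.linalg.slogdet(H)
    d = len(theta_map)
    return -res.fun + 0.5 * d * np.log(2 * np.pi) - 0.5 * logdet

# Toy usage: score one candidate model on noisy samples of a smooth function.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 40)
y = np.sin(x) + 0.1 * rng.standard_normal(len(x))
print("Laplace log evidence:", laplace_log_evidence(x, y))
```

In a model selection loop, this score would be computed for each candidate kernel and the highest-evidence model kept; the abstract's point is that this naive form is inconsistent and the paper's metrics correct it while retaining the speed advantage over dynamic nested sampling.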

