Thin and Deep Gaussian Processes (2310.11527v1)
Abstract: Gaussian processes (GPs) provide a principled approach to uncertainty quantification with easy-to-interpret kernel hyperparameters, such as the lengthscale, which controls the correlation distance of function values. However, selecting an appropriate kernel can be challenging. Deep GPs avoid manual kernel engineering by successively parameterizing kernels with GP layers, allowing them to learn low-dimensional embeddings of the inputs that explain the output data. Following the architecture of deep neural networks, the most common deep GPs warp the input space layer by layer but lose all the interpretability of shallow GPs. An alternative construction successively parameterizes the lengthscale of a kernel, improving interpretability but ultimately giving up the notion of learning lower-dimensional embeddings. Unfortunately, both methods are susceptible to particular pathologies that may hinder fitting and limit their interpretability. This work proposes a novel synthesis of both previous approaches: the Thin and Deep GP (TDGP). Each TDGP layer defines a locally linear transformation of the original input data, maintaining the concept of latent embeddings while retaining the interpretability of kernel lengthscales. Moreover, unlike the prior solutions, TDGP induces non-pathological manifolds that admit learning lower-dimensional representations. We show with theoretical and experimental results that i) TDGP is, unlike previous models, tailored specifically to discover lower-dimensional manifolds in the input data, ii) TDGP behaves well when increasing the number of layers, and iii) TDGP performs well on standard benchmark datasets.
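To make the layer construction concrete, here is a minimal sketch (not the authors' implementation) of drawing one TDGP layer from its prior in Python with NumPy. It assumes RBF kernels throughout, treats each entry of the matrix-valued field W(x) as an independent GP, and picks illustrative dimensions (D = 2 inputs, Q = 1 latent dimension); all names and hyperparameter values here are hypothetical choices for illustration. The key step is the locally linear map h(x) = W(x) x: the outer GP only sees the low-dimensional embedding h, so its lengthscale keeps its usual interpretation.

```python
# Minimal sketch of one TDGP layer prior sample; assumptions: RBF kernels,
# independent GP priors on the entries of W(x), illustrative sizes D=2, Q=1.
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Standard RBF (squared-exponential) kernel matrix between rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
N, D, Q = 200, 2, 1                             # latent dimension Q < input dimension D
X = rng.uniform(-2, 2, size=(N, D))

# Prior samples for each entry of the Q x D matrix-valued field W(x):
# each entry is an independent GP over the inputs (a modelling assumption here).
Kw = rbf(X, X, lengthscale=1.5) + 1e-8 * np.eye(N)
Lw = np.linalg.cholesky(Kw)
W = np.stack([Lw @ rng.standard_normal(N) for _ in range(Q * D)])
W = W.reshape(Q, D, N)                          # W[q, d, n] = W_{qd}(x_n)

# The TDGP hidden layer is a locally linear map: h(x) = W(x) x.
H = np.einsum('qdn,nd->nq', W, X)               # shape (N, Q)

# The outer GP acts on the low-dimensional embedding through a stationary kernel,
# so its lengthscale still measures correlation distance, now in h-space.
Kf = rbf(H, H, lengthscale=1.0) + 1e-8 * np.eye(N)
f = np.linalg.cholesky(Kf) @ rng.standard_normal(N)   # one prior function draw
```

Because W(x) itself varies smoothly with x, nearby inputs share nearly the same projection, so the transformation is only locally linear rather than a single global projection of the inputs; this is what preserves both the notion of a learned low-dimensional embedding and the lengthscale interpretation that the abstract contrasts with purely compositional deep GPs.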