Sparse Variational Student-t Processes (2312.05568v1)
Abstract: Bayesian learning uses Student-t Processes to model heavy-tailed distributions and datasets containing outliers. However, although Student-t Processes have a computational complexity similar to that of Gaussian Processes, sparse representations of this model have received little attention, mainly because they are harder to formulate and compute than sparse Gaussian Processes. Our motivation is to provide a sparse representation framework that reduces computational complexity, making Student-t Processes more practical for real-world datasets. To achieve this, we leverage the conditional distribution of Student-t Processes to introduce sparse inducing points, and we use Bayesian methods and variational inference to derive a well-defined lower bound that can be optimized efficiently through stochastic gradient descent. We propose two methods for computing this variational lower bound: one uses Monte Carlo sampling, and the other applies Jensen's inequality to compute the KL regularization term in the loss function. We recommend these approaches as viable alternatives to Gaussian Processes when the data may contain outliers or exhibit heavy-tailed behavior, and we provide specific recommendations for their applicability. We evaluate the two proposed approaches on various synthetic and real-world datasets from UCI and Kaggle, demonstrating their effectiveness compared to baseline methods in terms of computational complexity and accuracy, as well as their robustness to outliers.
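The sparsification step rests on the fact that the multivariate Student-t family is closed under conditioning, so inducing points can be introduced through an exact conditional, much as in sparse Gaussian Processes. For reference, the standard result in the scale-matrix parameterization reads as follows (the paper's own parameterization may differ in its constants):

```latex
% Conditional of a multivariate Student-t, scale-matrix parameterization.
% Partition f ~ t_nu(mu, Sigma) into (f_1, f_2) with dim(f_2) = p_2.
\begin{aligned}
f_1 \mid f_2 &\sim t_{\nu + p_2}\!\Bigl(\mu_{1|2},\; \tfrac{\nu + d_2}{\nu + p_2}\,\Sigma_{1|2}\Bigr), \\
\mu_{1|2} &= \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(f_2 - \mu_2), \\
\Sigma_{1|2} &= \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}, \\
d_2 &= (f_2 - \mu_2)^{\top}\Sigma_{22}^{-1}(f_2 - \mu_2).
\end{aligned}
```

Unlike the Gaussian case, the degrees of freedom grow under conditioning and the conditional scale depends on the conditioning values through the quadratic form d_2; this data-dependent scaling is part of what complicates the sparse variational treatment, and it is also why the KL term between Student-t distributions has no closed form. The sketch below illustrates the kind of Monte Carlo estimate of that KL term that the first proposed method relies on. It is a minimal illustration, not the paper's implementation; the inducing-point setup (m, K_prior, and the variational parameters) is hypothetical:

```python
import numpy as np
from scipy.stats import multivariate_t

rng = np.random.default_rng(0)

def mc_kl(q, p, num_samples=2000):
    """Monte Carlo estimate of KL(q || p) between multivariate Student-t
    distributions, for which no closed form exists:
    KL(q || p) = E_q[log q(f) - log p(f)], averaged over samples from q."""
    f = q.rvs(size=num_samples, random_state=rng)
    return np.mean(q.logpdf(f) - p.logpdf(f))

# Toy setup (hypothetical values): q is a variational posterior over the
# m inducing values, p is the Student-t process prior at the inducing inputs.
m = 5
K_prior = np.eye(m)  # stand-in for the kernel matrix at the inducing inputs
q = multivariate_t(loc=0.3 * np.ones(m), shape=0.5 * np.eye(m), df=6.0)
p = multivariate_t(loc=np.zeros(m), shape=K_prior, df=5.0)

print("MC estimate of KL(q || p):", mc_kl(q, p))
```

In a full training loop, an estimate like this would be subtracted from a Monte Carlo estimate of the expected log-likelihood over a minibatch to form the lower bound, which is then maximized by stochastic gradient descent; per the abstract, the second variant instead bounds the KL term analytically via Jensen's inequality.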