
Warm Start Marginal Likelihood Optimisation for Iterative Gaussian Processes (2405.18328v1)

Published 28 May 2024 in cs.LG and stat.ML

Abstract: Gaussian processes are a versatile probabilistic machine learning model whose effectiveness often depends on good hyperparameters, which are typically learned by maximising the marginal likelihood. In this work, we consider iterative methods, which use iterative linear system solvers to approximate marginal likelihood gradients up to a specified numerical precision, allowing a trade-off between compute time and accuracy of a solution. We introduce a three-level hierarchy of marginal likelihood optimisation for iterative Gaussian processes, and identify that the computational costs are dominated by solving sequential batches of large positive-definite systems of linear equations. We then propose to amortise computations by reusing solutions of linear system solvers as initialisations in the next step, providing a $\textit{warm start}$. Finally, we discuss the necessary conditions, quantify the consequences of warm starts, and demonstrate their effectiveness on regression tasks, where warm starts achieve the same results as the conventional procedure while providing up to a $16 \times$ average speed-up across datasets.
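The core idea can be illustrated with a minimal sketch (not the paper's implementation): during hyperparameter optimisation, successive kernel matrices differ only slightly, so initialising conjugate gradients at the previous step's solution instead of at zero reduces the iterations needed to solve $K x = y$. The kernel, data, and tolerances below are illustrative assumptions.

```python
import numpy as np

def cg_solve(A, b, x0, tol=1e-8, max_iter=1000):
    """Plain conjugate gradients; returns the solution and iteration count."""
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for it in range(max_iter):
        if np.sqrt(rs) < tol * np.linalg.norm(b):
            return x, it
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, max_iter

def rbf_kernel(X, lengthscale):
    d2 = (X[:, None] - X[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=300)
y = np.sin(X) + 0.1 * rng.normal(size=300)

cold_iters, warm_iters = 0, 0
x_warm = np.zeros_like(y)
# Hyperparameters drift slowly across optimisation steps, so consecutive
# linear systems K(theta_t) x = y have nearby solutions.
for ell in np.linspace(1.0, 0.8, 5):
    K = rbf_kernel(X, ell) + 0.1 * np.eye(300)  # kernel plus noise jitter
    _, it_cold = cg_solve(K, y, np.zeros_like(y))  # conventional cold start
    x_warm, it_warm = cg_solve(K, y, x_warm)       # warm start: reuse last solve
    cold_iters += it_cold
    warm_iters += it_warm

print(cold_iters, warm_iters)  # warm starts typically need fewer total iterations
```

The first solve is identical for both variants (the warm buffer starts at zero); the savings accumulate on every subsequent solve, mirroring the amortisation the paper exploits across sequential batches of positive-definite systems.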

