
Parallel Gaussian process with kernel approximation in CUDA (2403.12797v1)

Published 19 Mar 2024 in cs.DC

Abstract: This paper introduces a parallel implementation in CUDA/C++ of the Gaussian process with a decomposed kernel. This recent formulation, introduced by Joukov and Kulić (2022), is characterized by an approximated -- but much smaller -- matrix to be inverted compared to the plain Gaussian process. However, it exhibits a limitation when dealing with higher-dimensional samples, which degrades execution times. The solution presented in this paper relies on parallelizing the computation of the predictive posterior statistics on a GPU using CUDA and its libraries. The CPU and GPU implementations are then benchmarked on different CPU-GPU configurations to show the benefits of the parallel GPU implementation over the CPU one.
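
For context, the decomposed-kernel formulation the abstract refers to can be sketched in weight-space form; the notation below is illustrative rather than taken from the paper. Approximating the kernel by a truncated Mercer expansion with m eigenpairs (λ_i, φ_i),

    k(x, x') ≈ Σ_{i=1}^{m} λ_i φ_i(x) φ_i(x'),

and stacking the training features into Φ ∈ R^{n×m} with Λ = diag(λ_1, …, λ_m), the predictive posterior mean and variance at a test input x_* reduce to

    μ(x_*) = φ(x_*)^T (Φ^T Φ + σ² Λ^{-1})^{-1} Φ^T y,
    v(x_*) = σ² φ(x_*)^T (Φ^T Φ + σ² Λ^{-1})^{-1} φ(x_*),

so only the m×m matrix Φ^T Φ + σ² Λ^{-1} has to be factorized, rather than the n×n Gram matrix K + σ² I of the plain Gaussian process.

A minimal CUDA/C++ sketch of the corresponding GPU computation follows, using cuBLAS for the dense products and cuSOLVER for the Cholesky solve. The function names, the column-major layout, and the assumption that Φ, y, and the eigenvalues already reside on the device are choices made here for illustration; this is not the authors' implementation.

#include <cuda_runtime.h>
#include <cublas_v2.h>
#include <cusolverDn.h>

// Add sigma^2 / lambda_i to the diagonal of the m x m matrix A (column-major).
__global__ void add_scaled_inverse_diag(double *A, const double *lambda,
                                        double sigma2, int m)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < m) A[i * m + i] += sigma2 / lambda[i];
}

/* Solve (Phi^T Phi + sigma^2 * Lambda^{-1}) alpha = Phi^T y on the GPU.
   d_Phi: n x m training feature matrix (column-major), d_y: n-vector of targets,
   d_lambda: m kernel eigenvalues, d_A: m x m workspace, d_alpha: output m-vector. */
void posterior_weights(cublasHandle_t blas, cusolverDnHandle_t solver,
                       const double *d_Phi, const double *d_y, const double *d_lambda,
                       double sigma2, int n, int m, double *d_A, double *d_alpha)
{
    const double one = 1.0, zero = 0.0;
    // A = Phi^T * Phi -- an m x m product, not n x n
    cublasDgemm(blas, CUBLAS_OP_T, CUBLAS_OP_N, m, m, n,
                &one, d_Phi, n, d_Phi, n, &zero, d_A, m);
    // alpha = Phi^T * y -- right-hand side of the small linear system
    cublasDgemv(blas, CUBLAS_OP_T, n, m, &one, d_Phi, n, d_y, 1, &zero, d_alpha, 1);
    // A += sigma^2 * Lambda^{-1} (Lambda is diagonal)
    add_scaled_inverse_diag<<<(m + 255) / 256, 256>>>(d_A, d_lambda, sigma2, m);
    // Cholesky factorization of A, then triangular solves: alpha <- A^{-1} alpha
    int lwork = 0, *d_info = nullptr;
    double *d_work = nullptr;
    cusolverDnDpotrf_bufferSize(solver, CUBLAS_FILL_MODE_LOWER, m, d_A, m, &lwork);
    cudaMalloc(&d_work, sizeof(double) * lwork);
    cudaMalloc(&d_info, sizeof(int));
    cusolverDnDpotrf(solver, CUBLAS_FILL_MODE_LOWER, m, d_A, m, d_work, lwork, d_info);
    cusolverDnDpotrs(solver, CUBLAS_FILL_MODE_LOWER, m, 1, d_A, m, d_alpha, m, d_info);
    cudaFree(d_work);
    cudaFree(d_info);
}

The predictive mean at test inputs is then one further matrix-vector product, μ_* = Φ_* α, with Φ_* the test feature matrix; every step above scales with the number of eigenfunctions m rather than with the training set size n, which is the point of the decomposed formulation.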

References (13)
  1. Fasshauer, G.E., McCourt, M.J., 2012. Stable evaluation of Gaussian radial basis function interpolants. SIAM Journal on Scientific Computing 34, A737–A762.
  2. Gardner, J.R., Pleiss, G., Weinberger, K.Q., Bindel, D., Wilson, A.G., 2018. GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. arXiv:1809.11165.
  3. GPy, since 2012. GPy: A Gaussian process framework in Python. http://github.com/SheffieldML/GPy.
  4. Joukov, V., Kulić, D., 2022. Fast approximate multioutput Gaussian processes. IEEE Intelligent Systems 37, 56–69. doi:10.1109/MIS.2022.3169036.
  5. Matthews, A.G.d.G., Hensman, J., Turner, R., Ghahramani, Z., 2016. On sparse variational methods and the Kullback–Leibler divergence between stochastic processes, in: Gretton, A., Robert, C.C. (Eds.), Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR, Cadiz, Spain, pp. 231–239. URL: https://proceedings.mlr.press/v51/matthews16.html.
  6. Peng, H., Qi, Y., 2015. EigenGP: Gaussian process models with adaptive eigenfunctions, in: Proceedings of the 24th International Conference on Artificial Intelligence, AAAI Press, pp. 3763–3769.
  7. Pinder, T., Dodd, D., 2022. GPJax: A Gaussian process framework in JAX. Journal of Open Source Software 7, 4455. doi:10.21105/joss.04455.
  8. Quiñonero-Candela, J., Rasmussen, C.E., 2005. A unifying view of sparse approximate Gaussian process regression. The Journal of Machine Learning Research 6, 1939–1959.
  9. Snelson, E., Ghahramani, Z., 2005. Sparse Gaussian processes using pseudo-inputs. Advances in Neural Information Processing Systems 18.
  10. Steinwart, I., Scovel, C., 2012. Mercer's theorem on general domains: On the interaction between measures, kernels, and RKHSs. Constructive Approximation 35, 363–417. doi:10.1007/s00365-012-9153-3.
  11. Titsias, M.K., 2009. Variational learning of inducing variables in sparse Gaussian processes, in: Artificial Intelligence and Statistics, PMLR, pp. 567–574.
  12. van der Wilk, M., Dutordoir, V., John, S.T., Artemev, A., Adam, V., Hensman, J., 2020. A framework for interdomain and multioutput Gaussian processes. arXiv:2003.01115.
  13. Williams, C.K.I., Seeger, M., 2000. Using the Nyström method to speed up kernel machines, in: Proceedings of the 13th International Conference on Neural Information Processing Systems, MIT Press, Cambridge, MA, USA, pp. 661–667.

