GPU-Accelerated Vecchia Approximations of Gaussian Processes for Geospatial Data using Batched Matrix Computations (2403.07412v3)
Abstract: Gaussian processes (GPs) are commonly used for geospatial analysis, but they suffer from high computational complexity when dealing with massive data. For instance, evaluating the log-likelihood function needed to estimate the statistical model parameters for geospatial data is computationally intensive because it involves inverting an n × n covariance matrix, where n is the number of geographical locations. As a result, the literature has shifted towards approximation methods that handle larger values of n effectively while maintaining high accuracy. These methods encompass a range of techniques, including low-rank and sparse approximations. The Vecchia approximation is one of the most promising methods for speeding up the evaluation of the log-likelihood function. This study presents a parallel implementation of the Vecchia approximation that uses batched matrix computations on contemporary GPUs. The proposed implementation relies on batched linear algebra routines to efficiently evaluate the individual conditional distributions in the Vecchia algorithm. We rely on the KBLAS linear algebra library to perform the batched linear algebra operations, reducing the time to solution compared to the state-of-the-art parallel implementation of the likelihood estimation operation in the ExaGeoStat software by up to 700X, 833X, and 1380X on 32GB GV100, 80GB A100, and 80GB H100 GPUs, respectively. We also successfully manage larger problem sizes on a single NVIDIA GPU, accommodating up to 1M locations with 80GB A100 and H100 GPUs while maintaining the necessary application accuracy. We further assess the accuracy of the implemented algorithm, identifying the optimal settings for the Vecchia approximation algorithm to preserve accuracy on two real geospatial datasets: soil moisture data in the Mississippi Basin area and wind speed data in the Middle East.
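To make the idea of batching the conditional distributions concrete, the following is a minimal CPU sketch, not the paper's KBLAS/GPU implementation. It assumes a zero-mean GP, an exponential covariance with illustrative parameters, the given ordering of the locations, and conditioning on the m nearest previously ordered points. All function names (`exp_cov`, `gaussian_loglik`, `vecchia_loglik`) are hypothetical, and NumPy's stacked `cholesky` and `solve` calls stand in for batched GPU routines.

```python
import numpy as np

def exp_cov(a, b, variance=1.0, length_scale=0.1):
    """Exponential covariance between point sets a (..., p, 2) and b (..., q, 2).
    The kernel and its parameters are illustrative assumptions."""
    d = np.linalg.norm(a[..., :, None, :] - b[..., None, :, :], axis=-1)
    return variance * np.exp(-d / length_scale)

def gaussian_loglik(y, cov):
    """Exact zero-mean Gaussian log-likelihood using one dense Cholesky."""
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, y)
    return (-0.5 * (z @ z) - np.log(np.diag(chol)).sum()
            - 0.5 * len(y) * np.log(2.0 * np.pi))

def vecchia_loglik(locs, y, m):
    """Vecchia log-likelihood: sum of log p(y_i | y_neighbors(i)).
    Points 0..m-1 are handled jointly and exactly; every later point is
    conditioned on its m nearest earlier points, so all conditioning sets
    have the same size and can be processed as one batch."""
    n = len(y)
    if not 1 <= m < n:
        raise ValueError("m must satisfy 1 <= m < n")
    total = gaussian_loglik(y[:m], exp_cov(locs[:m], locs[:m]))

    # Neighbor indices for points m..n-1, shape (B, m) with B = n - m.
    idx = np.stack([
        np.argsort(np.linalg.norm(locs[:i] - locs[i], axis=1))[:m]
        for i in range(m, n)
    ])
    nb_locs, nb_y = locs[idx], y[idx]      # (B, m, 2), (B, m)
    tgt = locs[m:, None, :]                # (B, 1, 2)

    k_cc = exp_cov(nb_locs, nb_locs)       # (B, m, m) neighbor covariances
    k_ci = exp_cov(nb_locs, tgt)           # (B, m, 1) neighbor-target covariances
    k_ii = exp_cov(tgt, tgt)[:, 0, 0]      # (B,)     target variances

    # Batched Cholesky and triangular-system solves over the leading axis.
    chol = np.linalg.cholesky(k_cc)
    a = np.linalg.solve(chol, k_ci)[..., 0]             # L^{-1} k_ci
    b = np.linalg.solve(chol, nb_y[..., None])[..., 0]  # L^{-1} y_c

    cond_mean = (a * b).sum(axis=1)        # k_ic K_cc^{-1} y_c
    cond_var = k_ii - (a * a).sum(axis=1)  # k_ii - k_ic K_cc^{-1} k_ci
    resid = y[m:] - cond_mean
    total += np.sum(-0.5 * np.log(2.0 * np.pi * cond_var)
                    - 0.5 * resid**2 / cond_var)
    return total
```

With m = n - 1 the last point is conditioned on all earlier points, so the sketch reproduces the exact log-likelihood; smaller m trades accuracy for work that scales with the number of small m × m factorizations rather than one n × n factorization.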
- Qilong Pan
- Sameh Abdulah
- Marc G. Genton
- David E. Keyes
- Hatem Ltaief
- Ying Sun