Clustering the Nearest Neighbor Gaussian Process (2501.10656v1)

Published 18 Jan 2025 in stat.ME

Abstract: Gaussian processes are ubiquitous as the primary tool for modeling spatial data. However, fitting a Gaussian process to $n$ observations has $\mathcal{O}(n^3)$ computational cost, which makes direct parameter-fitting algorithms infeasible at the scale of modern data collection initiatives. The Nearest Neighbor Gaussian Process (NNGP) was introduced as a scalable approximation to dense Gaussian processes and has been applied successfully to $n\sim 10^6$ observations. This project introduces the $\textit{clustered Nearest Neighbor Gaussian Process}$ (cNNGP), which reduces the computational and storage cost of the NNGP. The accuracy of parameter estimation and the reduction in computational and memory requirements are demonstrated with simulated data, where the cNNGP provided inference comparable to that obtained with the NNGP in a fraction of the sampling time. To showcase the method's performance, we modeled biomass over the state of Maine using data collected by the Global Ecosystem Dynamics Investigation (GEDI) to generate wall-to-wall predictions over the state. In 16% of the time, the cNNGP produced inference and biomass prediction maps nearly indistinguishable from those obtained with the NNGP.
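To make the cost argument concrete, here is a minimal sketch of the standard NNGP idea that the cNNGP builds on: each location is conditioned on at most m of its nearest previously ordered neighbors, so the dense $\mathcal{O}(n^3)$ factorization is replaced by n small solves of size m. Everything named here is an assumption made for illustration: the function names (`exp_cov`, `nngp_factors`, `nngp_logdensity`), the exponential covariance, and the parameter values are hypothetical and not taken from the paper. The clustering step that distinguishes the cNNGP is not described in the abstract and is not shown.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance between two sets of locations (assumed kernel)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d)

def nngp_factors(coords, m=10, sigma2=1.0, phi=3.0):
    """For each location i, condition on its (at most) m nearest neighbors
    among locations 0..i-1. Returns neighbor indices, conditional-mean
    weights, and conditional variances."""
    n = coords.shape[0]
    neighbors, weights, cond_var = [], [], np.empty(n)
    for i in range(n):
        if i == 0:
            # The first location has no predecessors: marginal variance only.
            neighbors.append(np.array([], dtype=int))
            weights.append(np.array([]))
            cond_var[i] = sigma2
            continue
        # Brute-force neighbor search, kept simple for the sketch.
        d = np.linalg.norm(coords[:i] - coords[i], axis=1)
        idx = np.argsort(d)[:m]
        C_nn = exp_cov(coords[idx], coords[idx], sigma2, phi)
        c_in = exp_cov(coords[i:i + 1], coords[idx], sigma2, phi).ravel()
        b = np.linalg.solve(C_nn, c_in)  # one small solve, O(m^3)
        neighbors.append(idx)
        weights.append(b)
        cond_var[i] = sigma2 - c_in @ b
    return neighbors, weights, cond_var

def nngp_logdensity(w, neighbors, weights, cond_var):
    """Log density of w as a product of univariate Gaussian conditionals."""
    resid = np.array([w[i] - weights[i] @ w[neighbors[i]] for i in range(len(w))])
    return -0.5 * np.sum(np.log(2 * np.pi * cond_var) + resid ** 2 / cond_var)
```

The linear algebra here costs on the order of $n m^3$ operations and stores on the order of $n m$ weights, which is the baseline the abstract says the cNNGP reduces further. When m is at least n - 1, every location conditions on all of its predecessors and the sketch reproduces the dense Gaussian process density exactly.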
