Papers
Topics
Authors
Recent
2000 character limit reached

Fast Spatial Autocorrelation

Published 17 Oct 2020 in stat.ME and cs.DS | (2010.08676v1)

Abstract: Physical or geographic location proves to be an important feature in many data science models, because many diverse natural and social phenomenon have a spatial component. Spatial autocorrelation measures the extent to which locally adjacent observations of the same phenomenon are correlated. Although statistics like Moran's $I$ and Geary's $C$ are widely used to measure spatial autocorrelation, they are slow: all popular methods run in $\Omega(n2)$ time, rendering them unusable for large data sets, or long time-courses with moderate numbers of points. We propose a new $S_A$ statistic based on the notion that the variance observed when merging pairs of nearby clusters should increase slowly for spatially autocorrelated variables. We give a linear-time algorithm to calculate $S_A$ for a variable with an input agglomeration order (available at https://github.com/aamgalan/spatial_autocorrelation). For a typical dataset of $n \approx 63,000$ points, our $S_A$ autocorrelation measure can be computed in 1 second, versus 2 hours or more for Moran's $I$ and Geary's $C$. Through simulation studies, we demonstrate that $S_A$ identifies spatial correlations in variables generated with spatially-dependent model half an order of magnitude earlier than either Moran's $I$ or Geary's $C$. Finally, we prove several theoretical properties of $S_A$: namely that it behaves as a true correlation statistic, and is invariant under addition or multiplication by a constant.

Citations (4)

Summary

We haven't generated a summary for this paper yet.

Whiteboard

Paper to Video (Beta)

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.