Local Moran’s I (LISA)
- Local Moran’s I (LISA) is a statistical measure that decomposes spatial autocorrelation into local values to identify clusters (hot-spots and cold-spots) and spatial outliers.
- Several normalization schemes are in use (un-normalized, row-normalized, and global-normalized weights, combined with mean-centering or population-variance standardization); of these, only global-normalized weights with standardized variables make the local values sum exactly to global Moran’s I.
- Inference relies on permutation testing with multiple-testing correction, and interactive visualizations support interpretation; the method is widely applied in spatial econometrics, epidemiology, and urban studies.
Local Moran’s I (LISA) provides a formal, spatially explicit decomposition of spatial autocorrelation by quantifying the association between a value at a geographic location and the values observed at its spatial neighbors. As a local indicator of spatial association, it extends the global Moran’s I statistic, producing a separate measure for each spatial unit (“location”) that can detect local clusters (“hot-spots” and “cold-spots”) as well as local spatial outliers. The rigorous mathematical foundation, multiple forms of weights normalization, inferential permutation testing, and recent advances in visualization and normalization schemes underpin its broad uptake in geospatial modeling, spatial econometrics, epidemiology, and other fields.
1. Mathematical Definitions and Normalization Schemes
The classical (univariate) Local Moran’s I for an attribute vector $x = (x_1, \dots, x_n)$ over $n$ spatial units is defined, after mean-removal and (optionally) standardization, using a spatial weights matrix $W = (w_{ij})$ to encode the neighborhood structure. Mathematically:
$$I_i = (x_i - \bar{x}) \sum_{j \neq i} w_{ij}\,(x_j - \bar{x}),$$
where $\bar{x}$ is the global mean; the standardized form divides each deviation by the standard deviation. This is formally a local “gamma index” of matrix similarity, and up to proportionality can be seen as the $i$th row “projection” of the global Moran’s I calculation (Wang et al., 2021).
Normalization of both weights and variables fundamentally alters the inferential and interpretive properties. Three prominent variants have been formally analyzed (Chen, 2022):
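| Index | Weights matrix | Variable | Formula | Proportional to global $I$? |
|---|---|---|---|---|
| MI₁ | un-normalized, $v_{ij}$ | mean-centered, $y_i$ | $y_i \sum_j v_{ij} y_j$ | Yes (factor $V_0 \sigma^2$) |
| MI₂ | row-normalized, $w^{r}_{ij}$ | standardized, $z_i$ | $z_i \sum_j w^{r}_{ij} z_j$ | No |
| MI₃ | global-normalized, $w_{ij}$ | standardized, $z_i$ | $z_i \sum_j w_{ij} z_j$ | Yes (factor 1) |
Here, $y_i = x_i - \bar{x}$, $z_i = (x_i - \bar{x})/\sigma$ (where $\sigma^2$ is the population variance), $V = (v_{ij})$ is the raw (un-normalized) adjacency, $w^{r}_{ij} = v_{ij} / \sum_{k} v_{ik}$ with $\sum_j w^{r}_{ij} = 1$, and $w_{ij} = v_{ij} / V_0$ with $V_0 = \sum_i \sum_j v_{ij}$.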
Of these, MI₃ (using global-normalized weights and standardization by population variance) uniquely satisfies Anselin’s second requirement that “the sum of LISAs for all observations is proportional to a global indicator of spatial association,” with proportionality constant 1, so that the local values sum exactly to global Moran’s I (Chen, 2022).
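The difference between the three variants can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `local_moran_variants`, the toy values `x`, and the five-unit line adjacency `v` are illustrative assumptions, not taken from the cited papers. It assumes a zero diagonal and at least one neighbor per unit.

```python
from statistics import fmean, pstdev


def local_moran_variants(x, v):
    """Return MI1, MI2, MI3 (lists) and global Moran's I for values x and raw weights v."""
    n = len(x)
    mean, sigma = fmean(x), pstdev(x)      # pstdev is the population standard deviation
    y = [xi - mean for xi in x]            # mean-centered values
    z = [yi / sigma for yi in y]           # standardized values
    row_sums = [sum(row) for row in v]     # assumes every unit has a neighbor
    v0 = sum(row_sums)                     # total weight V_0

    def lag(weights_row, values):
        # Weighted sum of neighbor values for one location.
        return sum(wij * val for wij, val in zip(weights_row, values))

    mi1 = [y[i] * lag(v[i], y) for i in range(n)]                # raw weights, centered
    mi2 = [z[i] * lag(v[i], z) / row_sums[i] for i in range(n)]  # row-normalized
    mi3 = [z[i] * lag(v[i], z) / v0 for i in range(n)]           # global-normalized

    # Global Moran's I computed from the raw weights.
    cross = sum(v[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    global_i = (n / v0) * cross / sum(yi * yi for yi in y)
    return mi1, mi2, mi3, global_i


# Hypothetical toy data: five units on a line, adjacent units are neighbors.
x = [2.0, 4.0, 6.0, 3.0, 10.0]
v = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
mi1, mi2, mi3, global_i = local_moran_variants(x, v)
print("global I:", global_i)
print("sum MI1 :", sum(mi1))   # equals V_0 * sigma^2 * I
print("sum MI2 :", sum(mi2))   # no fixed relation to I computed from v
print("sum MI3 :", sum(mi3))   # equals I
```

The MI₁ factor follows from the definition of global Moran’s I: since $\sum_i y_i^2 = n\sigma^2$, the sum $\sum_i \sum_j v_{ij} y_i y_j$ equals $V_0 \sigma^2 I$.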
2. Panel and Multivariate Extensions
For spatial panel data (multiple outcomes or residuals $e_{it}$ measured for each region $i = 1, \dots, n$ across time points $t = 1, \dots, T$), Local Moran’s I generalizes to a multivariate inner product form. Let $e_i$ denote the $T$-dimensional vector for region $i$, $\bar{e}$ the across-region mean vector, and $A$ a positive-definite $T \times T$ matrix (often the identity). The panel-data LISA is:
$$I_i = \sum_{j} w_{ij}\,(e_i - \bar{e})^\top A\,(e_j - \bar{e}).$$
Matrix notation enables the simultaneous calculation of all $n$ entries as the diagonal of $W Z A Z^\top$, where $Z$ is the $n \times T$ data matrix whose $i$th row is $(e_i - \bar{e})^\top$ (Wang et al., 2021).
A plausible implication is that this generalization retains theoretical properties (e.g., local/global decomposability under global-normalized weights) under appropriate centering and normalization, but with the local statistic now quantifying multivariate (“vector”) similarity.
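A small sketch may make the inner-product form concrete. It computes, for each region, the weighted sum over neighbors of the quadratic form between centered region vectors. The function name `panel_lisa`, the argument names, and the toy data are assumptions for illustration; the similarity matrix defaults to the identity and is assumed symmetric.

```python
def panel_lisa(e, w, a=None):
    """Panel-data local Moran values.

    e: list of n rows, each a list of T observations for one region
    w: n x n spatial weights (list of lists)
    a: T x T symmetric positive-definite matrix; identity if omitted
    """
    n, t = len(e), len(e[0])
    if a is None:
        a = [[1.0 if r == c else 0.0 for c in range(t)] for r in range(t)]
    # Center each time point by its across-region mean.
    mean = [sum(row[k] for row in e) / n for k in range(t)]
    z = [[row[k] - mean[k] for k in range(t)] for row in e]

    def quad(u, s):
        # Bilinear form u^T A s.
        return sum(u[r] * a[r][c] * s[c] for r in range(t) for c in range(t))

    # I_i = sum_j w_ij * z_i^T A z_j, i.e. the i-th diagonal entry of W Z A Z^T.
    return [sum(w[i][j] * quad(z[i], z[j]) for j in range(n)) for i in range(n)]


# Hypothetical toy panel: four regions on a line, two time points.
e = [[1.0, 2.0], [2.0, 1.0], [4.0, 5.0], [5.0, 4.0]]
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(panel_lisa(e, w))
```

With the identity matrix the panel statistic is simply the sum of the univariate mean-centered local Moran values over time points; with a single time point it reduces to the univariate case.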
3. Computational Algorithms and Inference
LISA values should not be interpreted without significance testing. A standard workflow is:
- Standardize variables (usually via z-scores or mean-centering and population variance scaling).
- Construct an appropriate weights matrix (e.g., binary adjacency, inverse distance, with possible normalizations).
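- For each region $i$, compute $I_i$.
- Assess significance via permutation: with region $i$’s value fixed, permute neighbor labels $R$ times, recompute $I_i^{(r)}$ for each, and estimate a pseudo $p$-value
$$p_i = \frac{M_i + 1}{R + 1},$$
where $M_i$ is the number of permuted values $I_i^{(r)}$ at least as extreme as the observed $I_i$.
- Apply multiple-testing correction, e.g., Benjamini–Hochberg FDR, to control the false discovery rate across the $n$ local tests (Mason et al., 2024; Wang et al., 2021).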
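The workflow above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the procedure of any particular package: the function names, the toy data, the default of 999 permutations, and the choice to treat "at least as extreme" as a two-sided comparison of absolute values are all assumptions.

```python
import random
from statistics import fmean, pstdev


def conditional_permutation_p(x, w, n_perm=999, seed=0):
    """Pseudo p-values for local Moran's I via conditional permutation."""
    rng = random.Random(seed)
    n = len(x)
    mean, sigma = fmean(x), pstdev(x)
    z = [(xi - mean) / sigma for xi in x]
    p_values = []
    for i in range(n):
        idx = [j for j in range(n) if j != i]   # all other locations
        others = [z[j] for j in idx]            # values to be shuffled
        observed = z[i] * sum(w[i][j] * z[j] for j in idx)
        extreme = 0
        for _ in range(n_perm):
            rng.shuffle(others)                 # z[i] stays fixed at location i
            permuted = z[i] * sum(w[i][j] * zj for j, zj in zip(idx, others))
            if abs(permuted) >= abs(observed):  # two-sided (assumption)
                extreme += 1
        p_values.append((extreme + 1) / (n_perm + 1))
    return p_values


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the BH step-up procedure rejects."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    k_max = 0
    for rank, k in enumerate(order, start=1):
        if p_values[k] <= alpha * rank / m:
            k_max = rank                        # largest rank passing its threshold
    reject = [False] * m
    for rank, k in enumerate(order, start=1):
        if rank <= k_max:
            reject[k] = True
    return reject


# Hypothetical toy data: six units on a line with binary adjacency.
x = [1.0, 2.0, 1.5, 8.0, 9.0, 8.5]
w = [[1 if abs(i - j) == 1 else 0 for j in range(6)] for i in range(6)]
p_values = conditional_permutation_p(x, w)
print(p_values)
print(benjamini_hochberg(p_values))
```

With only six units the permutation distribution is very coarse, so this toy run is meant to show the mechanics rather than produce meaningful significance calls.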
For spatial panel models, analytic (non-sampling) bounds based on concentration inequalities can provide fast computation of $p$-values, avoiding the cost of enumerating all permutations or of Monte Carlo approximation. The bound is expressed in terms of the upper incomplete gamma function $\Gamma(\cdot, \cdot)$, with constants calculated from the weight structure and similarity function (Wang et al., 2021).
4. Interpretation and Classification of Local Patterns
The sign and magnitude of each $I_i$ are interpreted in conjunction with the standardized value $z_i$ at location $i$ and the spatial lag $\tilde{z}_i = \sum_j w_{ij} z_j$. Classical quadrant-based classification yields four cluster/outlier types:
- High–High (HH): $z_i > 0$, $\tilde{z}_i > 0$, $I_i > 0$; localized hot-spot (high value surrounded by highs)
- Low–Low (LL): $z_i < 0$, $\tilde{z}_i < 0$, $I_i > 0$; cold-spot (low value among lows)
- High–Low (HL): $z_i > 0$, $\tilde{z}_i < 0$, $I_i < 0$; high-value outlier amidst lows
- Low–High (LH): $z_i < 0$, $\tilde{z}_i > 0$, $I_i < 0$; low outlier among highs
A significant positive $I_i$ indicates the attribute at $i$ moves with its neighbors; a significant negative $I_i$ marks a spatial outlier. Final map labeling should always consider both the sign of $I_i$ and the accompanying $z_i$/$\tilde{z}_i$ values (Mason et al., 2024).
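The quadrant rules translate directly into a small labeling function. This sketch assumes a standardized value, its spatial lag, and a (possibly corrected) p-value per location are already available; the function name `classify`, the `"ns"` label for non-significant or undefined cases, and the 0.05 threshold are illustrative choices.

```python
def classify(z_i, lag_i, p_i, alpha=0.05):
    """Label one location as HH, LL, HL, LH, or 'ns' (not significant)."""
    if p_i > alpha:
        return "ns"
    if z_i > 0 and lag_i > 0:
        return "HH"   # high value among high neighbors
    if z_i < 0 and lag_i < 0:
        return "LL"   # low value among low neighbors
    if z_i > 0 and lag_i < 0:
        return "HL"   # high outlier among lows
    if z_i < 0 and lag_i > 0:
        return "LH"   # low outlier among highs
    return "ns"       # value or lag exactly zero: quadrant undefined


# Hypothetical inputs: (z, lag, p) triples for four locations.
examples = [(1.2, 0.8, 0.01), (-0.9, -1.1, 0.02), (1.5, -0.7, 0.03), (-1.0, 0.6, 0.40)]
print([classify(z, lag, p) for z, lag, p in examples])
```

Because the label depends on the signs of the value and the lag rather than on the local statistic alone, two locations with the same positive statistic can fall in opposite (HH versus LL) categories.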
5. Visualization Techniques
Advanced interactive visualizations have been proposed to facilitate exploration and interpretation of Local Moran’s I statistics. Three views are highlighted (Mason et al., 2024):
- Moran Dual-Density Plot: Shows the distribution of $z$-scores for the whole study region (with neighbors marked), the spatial lag, the computed $I_i$, and the permutation-based significance.
- Moran Network Scatterplot: Displays the Moran scatterplot ($z_i$ vs. the spatial lag $\sum_j w_{ij} z_j$), with neighbor relations rendered interactively (edges colored and scaled by weight, neighbor cluster labels).
- Spatial Lag Radial Plot: Radially arranges the neighbors of location $i$ by bearing; radial distance and marker size encode per-neighbor quantities, and a dashed circle provides a reference radius.
Brushing and linking coordinate the views: the map, scatterplot, and spatial-context displays update together as the user examines different locations, promoting a holistic understanding of spatial patterns.
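The coordinates behind a Moran scatterplot can be computed without any plotting library. The sketch below is not the moranplot implementation; the function name `moran_scatter` and the toy data are assumptions. It standardizes the values, forms row-normalized spatial lags, and fits the least-squares slope of lag on value. Under row-normalized weights that slope equals global Moran’s I computed with the same weights, which is why the scatterplot is a natural companion to the local statistics.

```python
from statistics import fmean, pstdev


def moran_scatter(x, v):
    """Return standardized values, row-normalized spatial lags, and the OLS slope."""
    n = len(x)
    mean, sigma = fmean(x), pstdev(x)
    z = [(xi - mean) / sigma for xi in x]
    # Row-normalized lag: weighted average of neighbor z-scores
    # (assumes every unit has at least one neighbor).
    lag = [sum(v[i][j] * z[j] for j in range(n)) / sum(v[i]) for i in range(n)]
    # Ordinary least-squares slope of lag on z.
    z_bar, lag_bar = fmean(z), fmean(lag)
    slope = (sum((a - z_bar) * (b - lag_bar) for a, b in zip(z, lag))
             / sum((a - z_bar) ** 2 for a in z))
    return z, lag, slope


# Hypothetical toy data: five units on a line, adjacent units are neighbors.
x = [2.0, 4.0, 6.0, 3.0, 10.0]
v = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
z, lag, slope = moran_scatter(x, v)
for zi, li in zip(z, lag):
    print(f"z = {zi: .3f}   lag = {li: .3f}")
print("slope:", slope)
```

Each printed pair is one point of the scatterplot; the quadrant a point falls in corresponds to the HH, LL, HL, or LH label of that location.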
6. Comparative Properties and Best Practices
Key methodological considerations and comparisons include:
- Global-normalized weights and population variance standardization (MI₃) uniquely ensure that LISA values sum exactly to the global Moran’s I (Chen, 2022).
- Row normalization is often used in practical software and visualization for interpretability, but such MI₂ LISA values do not decompose the global measure.
- Choice of weights (contiguity, distance, or inverse distance) and normalization method critically impact detected patterns.
- Multiple testing is inherent: reporting cluster significance demands Type I error control; FDR procedures are standard (Wang et al., 2021, Mason et al., 2024).
- Caution is warranted for sparse neighborhoods: unstable significance inference and wide null distributions can result for locations with very few neighbors.
- For panel models, LISA can be directly applied to model residuals, supporting model checking for spatial dependence in spatial econometric regression (Wang et al., 2021).
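The caution about sparse neighborhoods can be illustrated with a simple count. Under conditional permutation with equal (binary) weights, the local statistic at a location with $k$ neighbors depends only on which $k$ of the remaining $n-1$ values land in its neighborhood, so its reference distribution has at most $\binom{n-1}{k}$ distinct arrangements. The helper names below are hypothetical, and the equal-weights setting is an assumption made to keep the count exact.

```python
from math import comb


def distinct_neighbor_sets(n, k):
    """Number of distinct neighbor sets for a location with k neighbors among n units."""
    return comb(n - 1, k)


def smallest_exact_p(n, k):
    """Lower bound on the exact permutation p-value (the observed set always counts)."""
    return 1 / comb(n - 1, k)


for k in (1, 2, 6):
    print(k, distinct_neighbor_sets(50, k), smallest_exact_p(50, k))
```

For 50 units, a location with a single neighbor has only 49 possible arrangements, so its exact p-value cannot fall below roughly 0.02, which leaves little room to survive a multiple-testing correction.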
7. Practical Applications and Case Studies
Local Moran’s I (LISA) is applied to diverse spatial problems. Notable recent implementations include:
- Electoral Panel Model: Over 3,100 U.S. counties across five presidential elections (2000–2016). Covariates included population density, median income, and racial composition. LISA identified statistically significant Republican “corridors” and Democratic clusters, guiding substantive political analysis (Wang et al., 2021).
- Cancer Mortality Studies: Age-adjusted cancer deaths in U.S. counties (2011–2020), with HH and LL clusters mapped at fine spatial scale. Visualization dashboards (available as the open-source moranplot library) bring transparency and contextualization to spatial “hot-spot” findings (Mason et al., 2024).
- Urban Demography: Empirical verification on Beijing–Tianjin–Hebei regional data demonstrates the exactness of MI₃ sum-to-global proportionality and illustrates how alternate weightings affect the cluster identification (Chen, 2022).
These studies exemplify the broad applicability and rigorous interpretability afforded by formally constructed LISA statistics, in both univariate and panel settings.