
GeoShapley: Spatial Attribution in ML

Updated 24 December 2025
  • GeoShapley is a unified, model-agnostic framework that quantifies spatial effects in ML predictions by extending Shapley values to include geographic context.
  • It employs a modified Kernel SHAP approach with spatial weighting and coalition sampling to derive locally adapted, spatially explicit attributions.
  • The framework enables granular decomposition of predictions, offering interpretable insights that align with established spatial statistics and econometric models.

GeoShapley is a unified, model-agnostic, theory-grounded framework for quantifying and decomposing spatial effects in predictions generated by ML or statistical models on geospatial tabular data. Rooted in cooperative game theory, GeoShapley extends the classical Shapley value approach by introducing geographic location as a first-class “player” and constructing spatially explicit attributions—including the main effects of features, the intrinsic effect of location, and their spatially structured interactions. This approach supports granular, interpretable decomposition of complex, nonlinear model predictions, yielding spatially varying explanations highly aligned with established concepts in geostatistics and spatial econometrics (Li, 2023, Li, 1 May 2025, Deng et al., 23 Jul 2025, Lu et al., 17 Dec 2025, Liu, 5 Mar 2024).

1. Theoretical Foundation and Formalization

GeoShapley originates from the Shapley value framework, which assigns a fair, axiomatic value to each feature by averaging its marginal contributions across all feature subsets (coalitions) in the prediction task. GeoShapley generalizes this setup as follows:

  • Players: $N = \{1, \dots, p, \mathrm{GEO}\}$, where $1, \dots, p$ index non-spatial features and $\mathrm{GEO}$ denotes the joint spatial feature (e.g., a longitude-latitude pair or a higher-dimensional encoding).
  • Value function: For any coalition $S \subseteq N$, $v(S, x) = \mathbb{E}_{X_{\bar{S}}}[f(x_S, X_{\bar{S}})]$; that is, the expected model output when features in $S$ are fixed at their observed values and the other features are marginalized (with respect to a reference distribution).
  • Shapley allocation: For player $i \in N$,

$$\phi_i(f, x) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! \, (|N| - |S| - 1)!}{|N|!} \big[ v(S \cup \{i\}, x) - v(S, x) \big].$$

  • Spatial extension: By elevating $\mathrm{GEO}$ to a joint player, GeoShapley is able to quantify (i) the intrinsic effect of location ($\phi_{\mathrm{GEO}}$), (ii) the marginal effect of each non-spatial feature ($\phi_j$), and (iii) the spatially varying interaction effect between each feature and location ($\phi_{(\mathrm{GEO},j)}$).

The GeoShapley decomposition expresses a local prediction as

$$\hat{y} = \phi_0 + \phi_{\mathrm{GEO}} + \sum_{j=1}^p \phi_j + \sum_{j=1}^p \phi_{(\mathrm{GEO},j)},$$

where $\phi_0$ is the global base value (the mean model output over the background data).

GeoShapley’s computation inherits the combinatorial complexity of the classical Shapley value (nominally $O(2^{|N|})$ forward passes per instance); practical implementations use Kernel SHAP approximations and spatial weighting to maintain tractability (Li, 2023, Deng et al., 23 Jul 2025, Li, 1 May 2025).
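The allocation formula and the efficiency property behind the decomposition can be checked by brute force on a toy model. The sketch below is illustrative only: it uses two players (one feature plus an atomic GEO pair), a made-up model `f`, and folds the feature-location interaction into the two Shapley values rather than reporting $\phi_{(\mathrm{GEO},j)}$ separately as GeoShapley does.

```python
import itertools
import math
import numpy as np

# Toy model with one feature x1 and location (u, v); purely illustrative.
def f(x1, u, v):
    return 2.0 * x1 + 0.5 * u * v + 0.3 * x1 * u

rng = np.random.default_rng(0)
bg = rng.normal(size=(200, 3))          # background rows: x1, u, v

def value(coalition, x):
    """v(S, x): mean of f with players in S fixed at x, others from background.
    GEO is atomic: u and v enter or leave a coalition together."""
    cols = bg.copy()
    if "x1" in coalition:
        cols[:, 0] = x[0]
    if "GEO" in coalition:
        cols[:, 1:] = x[1:]
    return f(cols[:, 0], cols[:, 1], cols[:, 2]).mean()

players = ["x1", "GEO"]
x = np.array([1.0, 0.5, -0.8])          # instance to explain
n = len(players)

phi = {}
for i in players:
    rest = [p for p in players if p != i]
    total = 0.0
    for r in range(len(rest) + 1):
        for S in itertools.combinations(rest, r):
            # |S|!(|N|-|S|-1)!/|N|! — the combinatorial Shapley weight
            w = math.factorial(len(S)) * math.factorial(n - len(S) - 1) / math.factorial(n)
            total += w * (value(set(S) | {i}, x) - value(set(S), x))
    phi[i] = total

phi0 = value(set(), x)                  # base value: mean output over background
# Efficiency: base value plus contributions reproduces the prediction at x.
assert abs(phi0 + sum(phi.values()) - f(*x)) < 1e-9
```

The final assertion is the efficiency axiom that makes the additive decomposition above exact for each explained instance.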

2. Algorithmic Implementation and Computational Strategies

GeoShapley explanations are typically realized by modifying the Kernel SHAP procedure:

  1. Preparation: Enumerate the players in $N$ (including the joint GEO player). Choose a background reference set; for spatial data this may involve spatially stratified or geographically weighted samples.
  2. Coalition sampling: For each local explanation, draw $M \ll 2^{|N|}$ random feature subsets $S$.
  3. Hybrid construction: For each $S$, generate $x^{(S)}$, in which features in $S$ take their actual values and the others are sampled from (possibly spatially weighted) background data.
  4. Spatial weighting: Apply a kernel function $w(i, m)$, often Gaussian in the spatial distance between the focal and background locations, so that the marginal contributions are locally adapted.
  5. Shapley coefficient weighting: Apply the standard combinatorial Shapley weight to each coalition.
  6. Linear solution: Solve a weighted regression or (for linear models) a closed-form aggregation to obtain $\{\phi_0, \phi_{\mathrm{GEO}}, \phi_j, \phi_{(\mathrm{GEO},j)}\}$.

Recent enhancements for deep models with geospatial attention, such as GeoAggregator, cache spatial neighbor lists via k-d trees, assemble “neighbor-aware” sequences for explanations even when features are masked, and ensemble over stochastic neighbor subsamples to stabilize both predictions and attributions (Deng et al., 23 Jul 2025).
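The neighbor-caching idea can be sketched with a k-d tree. This is illustrative only; `neighbor_sequence` is a hypothetical helper, not GeoAggregator's actual API.

```python
import numpy as np
from scipy.spatial import cKDTree

# Precompute each point's k nearest spatial neighbors once, so every masked
# or hybrid explanation query reuses the cached lists instead of a new search.
coords = np.random.default_rng(2).uniform(0, 100, size=(5000, 2))
tree = cKDTree(coords)
_, neighbor_idx = tree.query(coords, k=16)   # cached (5000, 16) index table

def neighbor_sequence(i, features):
    """Assemble a neighbor-aware feature sequence for point i (hypothetical helper)."""
    return features[neighbor_idx[i]]
```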

For models with high computational cost, spatial kernel bandwidths and coalition-sample sizes $M$ must balance explainability against runtime; pipeline optimizations (e.g., neighbor caching, parallelization) are essential in large-scale settings (Deng et al., 23 Jul 2025, Lu et al., 17 Dec 2025, Li, 1 May 2025).

3. Interpretation, Decomposition, and Comparison to Spatial Models

GeoShapley directly parallels concepts in spatial statistics:

  • $\phi_{\mathrm{GEO}}$ recovers the intrinsic contextual effect of location, analogous to a local intercept (e.g., $f_0(\mathbf{u}, \mathbf{v})$ in geographically weighted regression, GWR).
  • $\phi_j$ measures the location-invariant effect of feature $j$, akin to a global additive main effect.
  • $\phi_{(\mathrm{GEO},j)}$ quantifies location-feature interactions, emulating spatially varying coefficients $B_j(\mathbf{u}, \mathbf{v})$ as in GWR/MGWR and spatial econometrics.

For linear spatial models of the form $y = f_0(\mathbf{u}, \mathbf{v}) + \sum_j B_j(\mathbf{u}, \mathbf{v}) X_j + \varepsilon$, GeoShapley locally approximates $B_j(\mathbf{u}, \mathbf{v}) \approx [\phi_j + \phi_{(\mathrm{GEO},j)}] / [X_j - \mathbb{E}(X_j)]$ (Li, 2023, Li, 1 May 2025). When feature and spatial effects are strictly additive and globally homogeneous, GeoShapley reduces to ordinary SHAP.
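The coefficient-recovery relation can be verified exactly for a two-player game on a simulated GWR-style process. In this sketch (illustrative names throughout), $\phi_j$ is taken as the main effect $v(\{j\}) - v(\varnothing)$ and $\phi_{(\mathrm{GEO},j)}$ as the pairwise Shapley interaction; their sum then equals $B(u,v)\,(x - \mathbb{E}[X])$ identically.

```python
import numpy as np

rng = np.random.default_rng(3)

def B(u, v):                 # spatially varying coefficient, as in GWR
    return 1.0 + 0.02 * u - 0.01 * v

def f(x, u, v):              # local-linear spatial model y = B(u, v) * x
    return B(u, v) * x

# Background sample over which absent players are marginalized.
Xb = rng.normal(1.0, 0.5, 500)
Ub = rng.uniform(0, 100, 500)
Vb = rng.uniform(0, 100, 500)

x, u, v = 2.0, 30.0, 70.0    # instance to explain

# Coalition values for the two players {x, GEO}.
v_empty = f(Xb, Ub, Vb).mean()
v_x     = f(x,  Ub, Vb).mean()
v_geo   = f(Xb, u,  v ).mean()
v_full  = f(x,  u,  v )

phi_x   = v_x - v_empty                      # main effect of the feature
phi_int = v_full - v_x - v_geo + v_empty     # feature-location interaction

# Local coefficient recovery: B(u, v) ~ (phi_j + phi_(GEO,j)) / (x - E[X])
B_hat = (phi_x + phi_int) / (x - Xb.mean())
assert abs(B_hat - B(u, v)) < 1e-9
```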

Unlike GWR/MGWR, which strictly enforce local linearity, GeoShapley admits arbitrary nonlinear and non-additive model architectures (e.g., XGBoost, neural nets), and the local decomposition holds for any black-box regressor (Lu et al., 17 Dec 2025, Deng et al., 23 Jul 2025, Liu, 5 Mar 2024).

4. Empirical Validation and Comparative Insights

GeoShapley’s empirical performance and interpretability have been demonstrated in a range of domains:

  • Synthetic spatial processes: On spatial-lag or spatially varying regression testbeds, GeoShapley explanations derived from transformer-based models like GeoAggregator exhibit near-perfect recovery of true spatial coefficients; standard tree-based SHAP methods yield noisier, less interpretable attributions, often with spurious boundaries (Deng et al., 23 Jul 2025, Li, 2023).
  • Socio-demographic phenomena: In modeling U.S. county-level voting behavior and housing prices, GeoShapley attributions for location and feature-location interactions closely align with known geographic divisions or market effects and tease apart the roles of features such as education, race, or housing grade that are conflated in global models (Li, 1 May 2025, Li, 2023).
  • Applied geospatial analytics: Analysis of traffic crash density in Florida using ML+GeoShapley isolates sharply local risk factors (e.g., intersection density impacts in Miami vs. rural tracts), achieving higher $R^2$ and lower MAE than MGWR or standard SHAP (Lu et al., 17 Dec 2025).

Quantitatively, GeoShapley-coupled pipelines routinely produce higher predictive accuracy and truer spatial fidelity in recovered coefficients than tree-based SHAP, GWR, or MGWR (Deng et al., 23 Jul 2025, Lu et al., 17 Dec 2025, Li, 2023). The integration of ensembling in deep-spatial models further improves both performance and uncertainty quantification (Deng et al., 23 Jul 2025).

5. Practical Implementation and Recommendations

Practical use of GeoShapley involves model-agnostic workflows with flexible background selection, spatial kernel tuning, and efficient subset sampling. The open-source Python package geoshapley provides scikit-learn-compatible interfaces, with visualizations for spatial and marginal-effect plots (Li, 2023, Li, 1 May 2025). Bandwidth selection may be data-adaptive (e.g., the median inter-centroid distance, or cross-validated error minimization). For moderate to large $p$, it is advisable to restrict attention to the top-$k$ features or to group features for tractability.
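The background-selection and bandwidth heuristics mentioned above can be sketched as follows; the helper names are hypothetical, and a random subsample stands in for spatially stratified or k-means background selection.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(4)
X = rng.normal(size=(2000, 5))                 # non-spatial features
coords = rng.uniform(0, 100, size=(2000, 2))   # (x, y) locations

# Background selection: a random subsample; spatially stratified or
# k-means-summary backgrounds are common alternatives.
idx = rng.choice(len(X), size=100, replace=False)
bg_X, bg_coords = X[idx], coords[idx]

# Data-adaptive spatial bandwidth: median pairwise distance between
# background locations (one of the heuristics noted in the text).
bandwidth = np.median(pdist(bg_coords))

def spatial_weight(d, bw=bandwidth):
    """Gaussian kernel weight for a focal-to-background distance d."""
    return np.exp(-0.5 * (d / bw) ** 2)
```

In practice the bandwidth would then be refined by cross-validated error minimization rather than used as-is.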

Bootstrap confidence intervals can be constructed by repeated model refitting and recomputation of attributions, though this is computationally intensive (Li, 1 May 2025). For high-dimensional or deep learner settings, caching, neighbor precomputation, and parallelization are essential (Deng et al., 23 Jul 2025).
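The bootstrap procedure can be sketched as below. A stand-in least-squares fit replaces the refit-and-explain step, whose cost dominates in real pipelines; `fit_and_attribute` is a hypothetical placeholder, not part of any package.

```python
import numpy as np

rng = np.random.default_rng(5)

def fit_and_attribute(X, y, x0):
    """Stand-in for refitting the model and recomputing attributions at x0."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # placeholder linear refit
    return coef * x0                              # placeholder "attributions"

X = rng.normal(size=(300, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 300)
x0 = np.array([0.4, 1.1, -0.7])

# Resample rows with replacement, refit, and recompute attributions each time.
boot = np.array([
    fit_and_attribute(X[s], y[s], x0)
    for s in (rng.integers(0, len(X), len(X)) for _ in range(200))
])
lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)  # 95% CI per attribution
```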

Interpretation should focus primarily on $\phi_{\mathrm{GEO}}$ and $\phi_{(\mathrm{GEO},j)}$ once the base ($\phi_0$) and main ($\phi_j$) effects are stably estimated. For high-dimensional geospatial encodings (e.g., learned spatial embeddings), the joint embedding should be treated as a single "GEO" player (Li, 2023).

6. Limitations and Future Directions

GeoShapley inherits the computational demands of Shapley-based approaches. For deep geospatial models, post-hoc explanations may require up to $\sim$1800 seconds per experiment versus under 100 seconds for conventional tree-based SHAP (Deng et al., 23 Jul 2025). Algorithmic advances such as GPU-accelerated explanation, sparsity-aware or hierarchical approximation, and analytic bounds on approximation error are active areas for future development.

Spatial autocorrelation among background points remains an open issue: current GeoShapley variants typically treat location as an atomic feature and do not adjust for spatial clustering in the background. Incorporating spatial autocorrelation structure into the marginalization process may further enhance fidelity for datasets with spatially clustered outcomes (Deng et al., 23 Jul 2025, Lu et al., 17 Dec 2025).

Scaling to real-world, large-scale geospatial tasks (e.g., climate downscaling, epidemiology) will require further reduction in computational overhead and possible methodological adaptation—potentially hybridizing with model-specific or structure-exploiting explainers (Deng et al., 23 Jul 2025, Lu et al., 17 Dec 2025).

7. Relationship to the GeoShapley Family and Ensemble Frameworks

GeoShapley forms the foundational member of a broader family of spatially attuned XAI methodologies. Ensemble frameworks such as XGeoML extend the core principles by integrating geographically weighted sample pre-processing, local model fitting, and multiple explainers (SHAP, LIME, Feature Importance) into a cohesive local-decomposition pipeline (Liu, 5 Mar 2024).

These methods enhance the stability and fidelity of spatial explanations, mitigate high-frequency noise issues associated with “raw” Shapley maps, and enable robust attribution in the presence of complex nonlinear and interaction effects. The ensemble approach supports comprehensive uncertainty quantification and flexibility for both regression and classification problems in geospatial machine learning.

GeoShapley, in both its canonical and ensemble-extended forms, is now recognized as a rigorous bridge between spatial statistics and modern explainable ML, enabling reproducible, locally accurate, and interpretable decomposition of spatial predictions across diverse research domains (Liu, 5 Mar 2024, Lu et al., 17 Dec 2025, Li, 2023).
