Kernel SHAP: Model-Agnostic Explanations
- Kernel SHAP is a model-agnostic method that explains predictions by approximating Shapley values through weighted linear regression on binary feature subsets.
- It improves computational efficiency and reduces variance using strategies such as paired sampling, deterministic weighting, and leverage score sampling.
- Extensions of Kernel SHAP address higher-order interactions, stability issues, and inherent feature dependencies to enhance interpretability and performance.
Kernel SHAP is a model-agnostic method for interpreting individual predictions from complex machine learning models by approximating Shapley values—the unique solution among additive feature attribution methods satisfying local accuracy, missingness, and consistency. It frames the explanation of a model’s prediction as a local weighted linear regression over binary feature subsets, with a theoretically derived weighting kernel grounded in cooperative game theory. Since its introduction, extensive technical work has tackled statistical properties, algorithmic design, computational efficiency, stability, kernel structure, and provable correctness of KernelSHAP and its variants.
1. Theoretical Foundations and Formulation
Kernel SHAP operationalizes Shapley values in the general class of additive feature attribution methods: explanations are framed as linear models

$g(z') = \phi_0 + \sum_{i=1}^{M} \phi_i z'_i,$

where $z' \in \{0,1\}^M$ encodes the inclusion or exclusion of the $M$ features, and each $\phi_i$ quantifies the contribution of feature $i$. For a black-box model $f$ and input $x$, the prediction is interpreted as $f(x) = g(x')$ with the binarized indicator $x'$ corresponding to $x$.
The unique solution for the $\phi_i$ is the Shapley value:

$\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,(|F|-|S|-1)!}{|F|!}\left[f_{S \cup \{i\}}(x_{S \cup \{i\}}) - f_S(x_S)\right],$

where $f_S(x_S) = \mathbb{E}[f(x) \mid x_S]$ is the expected model output with only the features in $S$ known. This solution is uniquely characterized by local accuracy, missingness, and consistency (see (Lundberg et al., 2017)).
Kernel SHAP solves for $\phi$ by casting the explanation as a weighted linear regression:

$\min_{\phi}\; \sum_{z' \in \{0,1\}^M} \pi_x(z')\,\big(f(h_x(z')) - g(z')\big)^2,$

with the Shapley kernel

$\pi_x(z') = \frac{M-1}{\binom{M}{|z'|}\,|z'|\,(M-|z'|)}$

(diverging for $|z'| = 0$ or $|z'| = M$, for which exact constraints on $\phi_0$ and $\sum_i \phi_i$ are enforced instead) (Lundberg et al., 2017).
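The weighted regression above can be made concrete in a few lines. The following is a minimal sketch (function names `shapley_kernel_weight` and `kernel_shap_exact` are illustrative, not from any library) that enumerates all interior coalitions, weights them by the Shapley kernel, and enforces the efficiency constraint by variable elimination; in low dimensions this recovers exact Shapley values.

```python
import itertools
from math import comb

import numpy as np

def shapley_kernel_weight(M, s):
    """Shapley kernel pi(z') for a coalition of size s out of M features.
    Diverges at s = 0 and s = M; those coalitions are handled via constraints."""
    return (M - 1) / (comb(M, s) * s * (M - s))

def kernel_shap_exact(value_fn, M):
    """Solve for Shapley values by weighted least squares over all
    2^M - 2 interior coalitions, with phi_0 = v(empty) and the
    efficiency constraint sum(phi) = v(full) - v(empty)."""
    coalitions, weights, values = [], [], []
    for size in range(1, M):
        for S in itertools.combinations(range(M), size):
            z = np.zeros(M)
            z[list(S)] = 1.0
            coalitions.append(z)
            weights.append(shapley_kernel_weight(M, size))
            values.append(value_fn(set(S)))
    Z = np.array(coalitions)
    W = np.diag(weights)
    v0, v_full = value_fn(set()), value_fn(set(range(M)))
    y = np.array(values) - v0
    # Eliminate the constraint by substituting
    # phi_M = (v_full - v0) - sum(phi_1 .. phi_{M-1}).
    A = Z[:, :-1] - Z[:, -1:]
    b = y - Z[:, -1] * (v_full - v0)
    phi_partial = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)
    phi_last = (v_full - v0) - phi_partial.sum()
    return np.append(phi_partial, phi_last)
```

For an additive game such as $v(S) = \sum_{i \in S} w_i$, this constrained fit is exact and returns $\phi_i = w_i$, which is a convenient sanity check for any implementation.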
2. Computational Strategies, Sampling, and Efficiency
Direct computation requires a model evaluation for each of the $2^M$ coalitions, which quickly becomes infeasible. Kernel SHAP circumvents this via weighted sampling of coalitions, recycling information across all features using a carefully derived Shapley kernel. The regression is solved over these sampled points, subject to sum-to-prediction constraints.
Variance reduction and efficiency improvements include:
- Paired Sampling: Simultaneous sampling of every coalition and its complement substantially reduces variance and, for interactions of at most order two, gives exact Shapley values (Covert et al., 2020, Mayer et al., 18 Aug 2025). When the value function is quadratic, the paired formulation cancels out sampling error entirely.
- Deterministic Weighting: Replacing stochastic sampling weights with deterministic approximations by averaging within coalition sizes (paired average, c-kernel, cel-kernel) more efficiently leverages the sampled coalitions, reducing the number of unique function evaluations needed by up to 95% compared to standard strategies (Olsen et al., 7 Oct 2024).
- Leverage Score Sampling: Sampling coalitions according to their leverage scores within the design matrix allows for provably accurate Shapley value estimation with $O(M \log M)$ samples, yielding tight non-asymptotic error bounds (Musco et al., 2 Oct 2024).
- Unified Least Squares Framework: A variety of recent advances, including unbiased KernelSHAP, LeverageSHAP, and modified least squares estimators, can be understood within a single regression framework using different sketching matrices and constraints, leading to quantitative sample complexity bounds (Chen et al., 5 Jun 2025).
Theoretical results establish convergence rates (central limit theorem, asymptotics), variance decomposition for sampling strategies, and explicit error bounds in terms of sample size and sampling distribution (Covert et al., 2020, Mayer et al., 18 Aug 2025, Musco et al., 2 Oct 2024, Chen et al., 5 Jun 2025).
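The paired strategy discussed above is easy to sketch: coalition sizes are drawn with probability proportional to the Shapley kernel weight summed over each layer, and every sampled coalition is emitted together with its complement. The function name `sample_paired_coalitions` is illustrative, not a library API.

```python
from math import comb

import numpy as np

def sample_paired_coalitions(M, n_pairs, rng):
    """Draw coalitions from the Shapley-kernel size distribution and
    return each together with its complement (paired sampling)."""
    sizes = np.arange(1, M)
    # P(|z'| = s) proportional to (M - 1) / (s * (M - s)): the kernel
    # weight of one coalition times the number of coalitions of size s.
    p = (M - 1) / (sizes * (M - sizes))
    p = p / p.sum()
    Z = []
    for _ in range(n_pairs):
        s = rng.choice(sizes, p=p)
        members = rng.choice(M, size=s, replace=False)
        z = np.zeros(M)
        z[members] = 1.0
        Z.append(z)
        Z.append(1.0 - z)  # complement: cancels low-order sampling error
    return np.array(Z)
```

The resulting design matrix is fed into the same constrained weighted regression as before; because each row is matched with its complement, odd-order error terms cancel, which is the mechanism behind the exactness result for at-most-pairwise interactions.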
3. Extensions: Stability, Interactions, and Model Structure
Stability: Standard Kernel SHAP sampling introduces instability (explanation variance between runs) due to random neighbor selection among high-layer coalitions. Restricting sampling to fully enumerate complete layers (e.g., all coalitions of size 1) guarantees full reproducibility and still produces meaningful attributions closely matching the full SHAP solution (Kelodjou et al., 2023).
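A deterministic layer-wise scheme of this kind can be sketched as follows (a simplification of the cited approach; the name `complete_layers` and the exact layer ordering, by descending kernel weight, are illustrative assumptions): coalition layers are enumerated in full, and a layer is included only if the evaluation budget covers it entirely, so repeated runs always use the same coalitions.

```python
import itertools
from math import comb

import numpy as np

def complete_layers(M, budget):
    """Deterministically enumerate full coalition layers in order of
    decreasing Shapley-kernel weight (sizes 1, M-1, 2, M-2, ...),
    stopping before any layer that would exceed the budget. Using only
    complete layers makes the resulting explanation fully reproducible."""
    order, lo, hi = [], 1, M - 1
    while lo <= hi:
        order.append(lo)
        if hi != lo:
            order.append(hi)
        lo, hi = lo + 1, hi - 1
    Z = []
    for size in order:
        if len(Z) + comb(M, size) > budget:
            break  # never take a partial layer
        for S in itertools.combinations(range(M), size):
            z = np.zeros(M)
            z[list(S)] = 1.0
            Z.append(z)
    return np.array(Z)
```
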
Higher-Order Interactions: KernelSHAP-IQ extends the WLS machinery to estimate Shapley interaction indices (SIIs) of arbitrary order by iteratively fitting models for additive, pairwise, and up to $k$-way feature interactions (Fumagalli et al., 17 May 2024). This enables explanations encompassing not just individual contributions but also synergistic or antagonistic effects among feature groups.
Model Structural Information: For models with low-order interactions or explicit functional ANOVA decompositions, exact SHAP values can be computed in polynomial time by combining contributions from low-dimensional functional components (Hu et al., 2023, Mohammadi et al., 22 May 2025). For linear/additive models, this reduces to simple difference formulas; for product-kernel RKHS models, recursive formulations with elementary symmetric polynomials achieve exactness and efficiency.
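The simple difference formula for the linear case can be stated directly: for $f(x) = w^\top x + b$ under feature independence, the exact SHAP values are $\phi_i = w_i (x_i - \mathbb{E}[x_i])$. A minimal sketch (the helper name `linear_shap` is illustrative):

```python
import numpy as np

def linear_shap(w, b, x, X_background):
    """Exact SHAP values for a linear model f(x) = w @ x + b under
    feature independence: phi_i = w_i * (x_i - E[x_i]). The intercept b
    cancels out of every marginal contribution."""
    mu = X_background.mean(axis=0)  # estimate E[x] from background data
    return w * (x - mu)
```

By construction the attributions satisfy efficiency: they sum to $f(x) - \mathbb{E}[f(X)]$, with no coalition enumeration or sampling required.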
4. Kernel SHAP in Practice: Applications, Implementation, and Feature Selection
Kernel SHAP is directly applied to explain predictions from diverse black-box models (autoencoders, random forests, XGBoost, and deep networks) in domains ranging from finance to anomaly detection (Roshan et al., 2023, Kariyappa et al., 2023). Its model-agnostic and local nature supports feature selection (by filtering for features with large attributions), model debugging, and regulatory reporting.
A key practical pitfall concerns global feature selection: naive averaging of per-instance SHAP values can fail to detect true dependencies when attributions are aggregated over the training support. Recent theoretical analysis proves that safe feature removal—guaranteeing that a feature is non-influential—can only be justified when aggregation is performed over the product of marginals (the extended support). This is operationalized by recomputing explanations on a column-wise permuted (scrambled) dataset to simulate feature independence (Bhattacharjee et al., 29 Mar 2025). The Shapley Lie algebra framework underpins these guarantees, connecting the invertibility and structure of value operators to the identifiability of redundant features.
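The column-wise scrambling step is mechanically simple: permuting each column independently destroys dependencies between features while preserving every marginal, so the scrambled sample approximates the product of marginals. A minimal sketch (the name `scramble_columns` is illustrative):

```python
import numpy as np

def scramble_columns(X, rng):
    """Independently permute each column of X, breaking cross-feature
    dependencies while preserving each marginal distribution. SHAP
    values recomputed on the scrambled data approximate aggregation
    over the product of marginals (the extended support)."""
    Xs = X.copy()
    for j in range(X.shape[1]):
        Xs[:, j] = rng.permutation(Xs[:, j])
    return Xs
```
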
5. Limitations, Trade-offs, and Recent Advances
Kernel SHAP’s accuracy is contingent on the number and representativeness of sampled coalitions, particularly in high-dimensional, dependent, or non-additive models. While paired, deterministic, and leverage-based sampling have improved statistical efficiency and runtime, exactness is only possible for models with suitable structure or in low dimensions.
- Variance–Bias Trade-off: Original KernelSHAP enjoys lower variance with negligible bias, whereas unbiased versions are easier to analyze theoretically, but may incur higher sample cost (Covert et al., 2020).
- Additive Recovery: Paired PermutationSHAP recovers group-wise sums exactly in additive settings, but KernelSHAP in its WLS form does not strictly possess this property, limiting interpretability in strictly partitioned models (Mayer et al., 18 Aug 2025).
Kernel SHAP also assumes feature independence when forming conditional expectations for masked features; for highly dependent data, errors in conditional distribution estimation may distort attributions.
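The independence assumption enters through the value function: masked features are typically imputed with background samples drawn from their marginals. A minimal sketch of such a marginal value function (the name `marginal_value_fn` is illustrative) makes the failure mode visible, since with dependent features the imputed rows can lie off the data manifold.

```python
import numpy as np

def marginal_value_fn(model, x, X_background):
    """Value function v(S) ~ E[f(x_S, X_{~S})] estimated by replacing
    masked features with background samples (marginal imputation).
    This implicitly assumes feature independence; with dependent
    features it evaluates the model on off-manifold inputs."""
    def v(mask):
        # mask: boolean array; True = feature kept at its value in x,
        # False = feature drawn from the background marginal.
        X = X_background.copy()
        X[:, mask] = x[mask]
        return model(X).mean()
    return v
```
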
Recent work has broadened the analytical framework for local explanations (using alternative kernels, Choquet integrals, and k-additive games), proposed robust uncertainty quantification methods, and improved scaling and faithfulness on high-dimensional data (Pelegrina et al., 2022, Hiraki et al., 1 Jun 2024, Chen et al., 5 Jun 2025).
6. Future Directions
Open research areas include:
- Further reduction of computational complexity in large-$M$ regimes, especially by leveraging sparsity, functional decomposition, and alternative kernel choices.
- Better handling of feature dependence when performing conditional expectation calculations in the local surrogate model.
- Integrating layer-wise and structure-exploiting variants for improved stability and interpretability, particularly in real-time and high-frequency explanation settings (Kelodjou et al., 2023, Mohammadi et al., 22 May 2025).
- Extending provable guarantees to additional modalities (e.g., time series, graphs) and for complex global model explanation settings (Villani et al., 2022, Han et al., 25 Nov 2024).
- Deeper exploration of algebraic structure (e.g., Shapley Lie algebra) to support axiomatic identification of redundant features under varied data distributions (Bhattacharjee et al., 29 Mar 2025).
In sum, Kernel SHAP is a theoretically principled, unified regression-based approach for local model interpretation that has inspired a broad family of efficient, stable, and provably accurate Shapley value estimation techniques for both individual and interactive feature attributions across a diverse array of machine learning applications.