SHAP Value Foundations in XAI
- Shapley values provide a rigorous method for attributing feature contributions, grounded in four fairness axioms: efficiency, symmetry, dummy, and additivity.
- SHAP and SAGE instantiate these principles for local and global explanations, quantifying feature impact on predictions and loss reduction respectively.
- Practical challenges include misattribution in feature selection, handling redundant features, and addressing pathologies in high-dimensional, non-monotonic data.
The Shapley value provides a unique, axiomatically justified method to allocate credit among features for a model’s prediction or performance, based on the cooperative game theory framework of transferable-utility games. Within explainable artificial intelligence (XAI), Shapley values have been instantiated as SHAP (SHapley Additive exPlanations) for local explanation and SAGE (Shapley Additive Global importancE) for global feature importance, dominating much of the attribution landscape. However, the axiomatic underpinnings that guarantee fairness and uniqueness also impose limitations that manifest as pathologies when Shapley values are applied naively to feature selection or highly correlated, high-dimensional data.
1. Transferable-Utility Games and the Shapley Value
The foundations of Shapley values originate in the framework of transferable-utility (TU) cooperative games. Given a set of players (or features) $N = \{1, \dots, n\}$ and a characteristic function $v : 2^N \to \mathbb{R}$ with $v(\emptyset) = 0$, the Shapley value for player $i$ is defined as

$$\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!} \bigl[ v(S \cup \{i\}) - v(S) \bigr].$$
This expression averages the marginal contribution of feature $i$ to every possible coalition $S \subseteq N \setminus \{i\}$, weighted by the probability that exactly the members of $S$ precede $i$ in a uniformly random ordering of the players. The Shapley value uniquely attributes the overall gain $v(N)$ back to the individual features (Fryer et al., 2021).
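The random-order interpretation can be checked directly. The sketch below (an illustrative "dictator" game, not taken from the cited papers) averages each player's marginal contribution over all orderings, which coincides with the coalition-weighted formula above:

```python
from itertools import permutations
from math import factorial

def shapley_by_orderings(n, v):
    """Average each player's marginal contribution over all n! player
    orderings; equivalent to the coalition-weighted Shapley formula."""
    phi = [0.0] * n
    for order in permutations(range(n)):
        coalition = frozenset()
        for i in order:
            phi[i] += v(coalition | {i}) - v(coalition)
            coalition = coalition | {i}
    return [p / factorial(n) for p in phi]

# Dictator game: all value is created by player 0 alone.
v = lambda S: 1.0 if 0 in S else 0.0
print(shapley_by_orderings(3, v))   # [1.0, 0.0, 0.0]
```

Player 0 receives all the credit; the other players' marginal contributions are zero in every ordering, matching the dummy axiom discussed below.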
2. The Four Axioms: Fairness Guarantees
Shapley values satisfy four “favourable and fair” axioms:
| Axiom | Mathematical Statement | Interpretation |
|---|---|---|
| (A1) Efficiency | $\sum_{i \in N} \phi_i(v) = v(N)$ | Total credit equals the total value of the grand coalition |
| (A2) Symmetry | $v(S \cup \{i\}) = v(S \cup \{j\})$ for all $S \subseteq N \setminus \{i, j\}$ implies $\phi_i(v) = \phi_j(v)$ | Indistinguishable features receive equal credit |
| (A3) Dummy (null player) | $v(S \cup \{i\}) = v(S)$ for all $S \subseteq N \setminus \{i\}$ implies $\phi_i(v) = 0$ | Irrelevant features get zero attribution |
| (A4) Additivity | $\phi_i(v + w) = \phi_i(v) + \phi_i(w)$ | Value allocation is linear over characteristic functions |
These axioms ensure that Shapley values are the only decomposition that is simultaneously fair (symmetry), omitting dummies (dummy), additive (additivity), and fully explains the total value (efficiency) (Fryer et al., 2021, Lundberg et al., 2017).
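Three of the four axioms can be verified numerically on a small game (additivity follows directly from the linearity of the formula). The sketch below uses a hypothetical three-player game in which players 0 and 1 are symmetric and player 2 is a dummy:

```python
from itertools import combinations
from math import factorial

def shapley(n, v):
    """Exact Shapley values by enumerating all coalitions."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                S = frozenset(S)
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (v(S | {i}) - v(S))
    return phi

# Toy game: v counts how many of players {0, 1} are present; player 2 is a dummy.
v = lambda S: float(len(S & {0, 1}))
phi = shapley(3, v)
assert abs(sum(phi) - v(frozenset({0, 1, 2}))) < 1e-12   # (A1) efficiency
assert abs(phi[0] - phi[1]) < 1e-12                      # (A2) symmetry
assert abs(phi[2]) < 1e-12                               # (A3) dummy
print(phi)   # ≈ [1.0, 1.0, 0.0]
```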
3. Uniqueness Argument and the Role of Unanimity Games
The uniqueness of the Shapley value follows from linearity (additivity), efficiency, and symmetry. The key idea is that the space of games admits a basis of unanimity games $u_T$ (where $u_T(S) = 1$ if $T \subseteq S$ and $u_T(S) = 0$ otherwise), and the Shapley value on these games is uniquely determined by the axioms: symmetry and efficiency force $\phi_i(u_T) = 1/|T|$ for $i \in T$ and $0$ otherwise. Any characteristic function can be decomposed into a linear combination of unanimity games, so additivity forces the Shapley solution to be unique (Fryer et al., 2021).
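The decomposition argument can be made concrete. The sketch below (using an illustrative toy game) recovers the coefficients of a game in the unanimity basis, known as Harsanyi dividends, via a Mobius-style recursion, and then reassembles the Shapley values by splitting each dividend d(T) evenly among the members of T:

```python
from itertools import combinations

def dividends(n, v):
    """Harsanyi dividends d(T): the coordinates of v in the unanimity basis,
    so that v(S) equals the sum of d(T) over all subsets T of S."""
    d = {}
    for k in range(n + 1):                      # process coalitions by size
        for T in combinations(range(n), k):
            T = frozenset(T)
            d[T] = v(T) - sum(d[R] for R in d if R < T)
    return d

def shapley_from_dividends(n, v):
    """Shapley value via the basis: each dividend d(T) is split evenly
    among the members of T, the unique axiom-consistent allocation."""
    d = dividends(n, v)
    return [sum(dT / len(T) for T, dT in d.items() if i in T) for i in range(n)]

v = lambda S: float(len(S) ** 2)                # illustrative symmetric game
print(shapley_from_dividends(3, v))             # symmetric: each gets v(N)/3 = 3.0
```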
4. Embedding SHAP and SAGE Within This Framework
SHAP (Local Explanations)
SHAP instantiates the characteristic function for an instance $x$ as

$$v_x(S) = \mathbb{E}\bigl[f(X) \mid X_S = x_S\bigr],$$

where features in $S$ are fixed to their observed values in $x$ and the rest are marginalized. The SHAP attribution for feature $i$ is $\phi_i(v_x)$. The mean absolute SHAP, $\mathbb{E}_x\bigl[|\phi_i(v_x)|\bigr]$, is often reported as a global score (Fryer et al., 2021).
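A minimal sketch of this construction, assuming an interventional (marginal) value function, a hypothetical linear model `f`, and a synthetic background sample (none of which come from the cited papers):

```python
import numpy as np
from itertools import combinations
from math import factorial

rng = np.random.default_rng(0)
X_bg = rng.normal(size=(256, 3))                 # synthetic background sample
f = lambda Z: 2.0 * Z[:, 0] + 1.0 * Z[:, 1]      # hypothetical model; ignores feature 2

def v_x(x, S):
    """Interventional value: clamp features in S to x, average over the background."""
    Z = X_bg.copy()
    for j in S:
        Z[:, j] = x[j]
    return f(Z).mean()

def shap_values(x, d=3):
    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            for S in combinations(others, k):
                w = factorial(len(S)) * factorial(d - len(S) - 1) / factorial(d)
                phi[i] += w * (v_x(x, set(S) | {i}) - v_x(x, set(S)))
    return phi

x = np.array([1.0, -0.5, 3.0])
phi = shap_values(x)
print(np.round(phi, 3))
# Efficiency: the attributions sum to f(x) minus the background mean prediction,
# and the ignored feature 2 receives zero credit (dummy axiom).
```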
SAGE (Global Explanations)
SAGE uses a "loss decrease" formulation, with characteristic function

$$v(S) = \mathbb{E}\bigl[\ell\bigl(\mathbb{E}[f(X)], Y\bigr)\bigr] - \mathbb{E}\bigl[\ell\bigl(\mathbb{E}[f(X) \mid X_S], Y\bigr)\bigr],$$

quantifying the average reduction in loss achieved by revealing the features in $S$. Shapley values over this global game produce SAGE importance scores (Fryer et al., 2021).
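The loss-decrease game can be sketched directly. The example below simplifies SAGE's conditional-expectation marginalization to mean imputation of hidden features, and assumes a hypothetical fitted linear model on synthetic data:

```python
import numpy as np
from itertools import combinations
from math import factorial

rng = np.random.default_rng(1)
n, d = 512, 3
X = rng.normal(size=(n, d))
y = 3.0 * X[:, 0] + X[:, 1]                    # feature 2 never influences the target
model = lambda Z: 3.0 * Z[:, 0] + Z[:, 1]      # hypothetical fitted model

def loss_with(S):
    """Squared-error loss when only the features in S are revealed;
    hidden features are replaced by their marginal means."""
    Z = np.tile(X.mean(axis=0), (n, 1))
    for j in S:
        Z[:, j] = X[:, j]
    return np.mean((y - model(Z)) ** 2)

v = lambda S: loss_with(()) - loss_with(S)     # loss decrease from revealing S

def shapley(d, v):
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            for S in combinations(others, k):
                w = factorial(len(S)) * factorial(d - len(S) - 1) / factorial(d)
                phi[i] += w * (v(set(S) | {i}) - v(set(S)))
    return phi

sage = shapley(3, v)
print(np.round(sage, 3))   # feature 0 dominates, feature 2 gets ~0
```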
5. Limitations and Pathologies in SHAP-Based Feature Selection
While theoretically compelling, the four axioms do not guarantee that Shapley-based rankings recover optimal feature subsets.
- Model-averaging vs. optimal coalitions: Shapley values average over all submodels due to additivity and efficiency, not just the optimal predictors. For instance, in a “taxicab” game where a dominant feature always drives the value, Shapley still allocates nonzero credit to subordinate features that help in suboptimal coalitions.
- Non-monotonic criteria “waste” value: If the value function is non-monotonic (e.g., penalizing model complexity as in AIC/BIC), Shapley’s efficiency axiom can allocate credit to inferior models, reducing interpretability.
- Symmetry versus redundancy: When features are perfectly correlated, symmetry forces an even allocation of credit, even if one of the features is redundant. This mechanism can inflate or deflate the importance of the truly causal feature set.
- Secret-holder and Markov-boundary pathologies: Features critical to the best small submodels (the “secret-holder”) may be under-credited if their value is unlocked only in certain combinations. SHAP can also overemphasize proxies (non-Markov boundary members) in highly correlated systems, while SAGE, by measuring global loss decrease, partially mitigates this (Fryer et al., 2021).
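The redundancy pathology is easy to reproduce. In the two-player game below (an illustrative construction, not from the cited papers), features 0 and 1 are perfect copies, so revealing either one yields the full predictive value; symmetry then splits the credit evenly even if a model only ever reads feature 0:

```python
from itertools import combinations
from math import factorial

# Features 0 and 1 are perfect copies: knowing either one yields the full
# predictive value, so the two players are symmetric in this game.
v = lambda S: 1.0 if S & {0, 1} else 0.0

def shapley(n, v):
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                S = frozenset(S)
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (v(S | {i}) - v(S))
    return phi

print(shapley(2, v))   # [0.5, 0.5]: the redundant copy takes half the credit
```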
6. Algebraic and Distributional Refinements: Safe Feature Elimination and the Extended Distribution
Recent theoretical advances demonstrate that aggregating SHAP values only over the observed data distribution may miss true variable dependence. Specifically, it is possible to construct functions whose SHAP values are zero everywhere on the data support, yet which genuinely depend on a feature (Bhattacharjee et al., 29 Mar 2025). The soundness guarantee is instead established under the extended support, the product of the feature marginals: if the SHAP values of feature $i$ are bounded by $\epsilon$ everywhere on the extended support,
then $f$ is $\epsilon$-close (with loss at most $\epsilon$) to a function independent of feature $i$. For empirical SHAP (e.g., KernelSHAP), this guarantee is realized by independently permuting each feature column in the data matrix to simulate product-of-marginals sampling, a procedure fully justified by the algebraic structure (in particular, the solvability and simultaneous triangularization of the Shapley Lie algebra generated by interventional-value operators on the function space) (Bhattacharjee et al., 29 Mar 2025).
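The column-permutation procedure is straightforward to sketch. The example below (synthetic data, not the cited authors' code) shows that independently permuting each column preserves the marginals while destroying the dependence between two originally identical features:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
X[:, 1] = X[:, 0]                    # strong dependence in the observed data

def extended_support_sample(X, rng):
    """Simulate the product-of-marginals (extended) distribution by
    independently permuting each feature column."""
    Z = X.copy()
    for j in range(Z.shape[1]):
        Z[:, j] = rng.permutation(Z[:, j])
    return Z

Z = extended_support_sample(X, rng)
print(np.corrcoef(X[:, 0], X[:, 1])[0, 1])   # ≈ 1: identical columns
print(np.corrcoef(Z[:, 0], Z[:, 1])[0, 1])   # ≈ 0: dependence destroyed
```

Each column is permuted with its own independent permutation, so every marginal distribution is exactly preserved while cross-feature dependence is broken.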
7. Conclusion and Implications for Model Interpretability
The theoretical foundations of SHAP and SAGE are grounded in a rigorous set of axioms that guarantee fairness, consistency, and uniqueness. However, practical deployment—particularly for feature selection—requires careful attention to the choice of characteristic function, the structure of the data distribution, and the interaction patterns among features. Naively applying SHAP-based rankings to select variables, without adjustments for model structure or feature dependencies, may yield misleading results. Advances such as extended-distribution aggregation and algebraic analysis provide new pathways to sound variable elimination, but also underscore the continuing need to tailor the axiomatic framework to model selection desiderata and underlying data geometry (Fryer et al., 2021, Bhattacharjee et al., 29 Mar 2025).