Shapley-Based & Interaction Explanations
- Shapley-based and interaction-focused explanations are principled methods that attribute both individual feature contributions and their higher-order interactions using axiomatic extensions of the classical Shapley value.
- They employ methodologies such as Shapley–Taylor, Faith-Shap, and SHAP-IQ to compute interaction effects through discrete derivatives, weighted regression, and sampling-based approximations.
- These approaches are adaptable to structured inputs like graphs, ensuring interpretability and model fidelity while addressing computational challenges in high-dimensional settings.
Shapley-based and interaction-focused explanations constitute a principled, axiomatically grounded framework for attributing the output of black-box machine learning models to both individual input features and their high-order interactions. These approaches generalize the classical Shapley value—originally developed in cooperative game theory for fair value allocation among players in a coalition game—to quantify the contributions of coalitions of features or components, providing uniquely justified decompositions of a model’s prediction that extend to arbitrary interaction orders and structured input domains.
1. Foundations: Shapley Value and Interaction Generalizations
The Shapley value for a pseudo-Boolean game $\nu : 2^N \to \mathbb{R}$ (with $N = \{1, \dots, n\}$ the set of features or "players") uniquely allocates the total surplus $\nu(N) - \nu(\emptyset)$ to individual features so as to satisfy four canonical axioms: dummy, symmetry, linearity, and efficiency. The closed-form expression is

$$\phi_i(\nu) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!}\,\bigl(\nu(S \cup \{i\}) - \nu(S)\bigr).$$
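For small player counts, the closed-form sum can be evaluated exactly by enumerating all coalitions. A minimal pure-Python sketch (the `shapley_values` helper is illustrative, not taken from the cited works):

```python
from itertools import combinations
from math import factorial

def shapley_values(v, n):
    """Exact Shapley values of an n-player game v: frozenset -> float,
    via the closed-form weighted sum over all coalitions S not containing i."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            for S in combinations(others, r):
                S = frozenset(S)
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (v(S | {i}) - v(S))
    return phi
```

For the symmetric game $\nu(S) = |S|^2$ with three players, symmetry and efficiency force each feature's attribution to $\nu(N)/3 = 3$, which the helper reproduces.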
A central extension challenge is defining attributions for arbitrary subsets $S \subseteq N$ with $|S| \ge 2$, corresponding to interaction effects among features. Multiple interaction indices have been developed, each characterized by its axiomatics, interaction-order truncation, and computational properties.
The Shapley–Taylor interaction index provides an order-$\ell$ decomposition, assigning attributions to all $S \subseteq N$ with $|S| \le \ell$ such that the attributions sum to $\nu(N) - \nu(\emptyset)$ and respect higher-order discrete derivatives. The Faith-Shap (Faithful Shapley Interaction) index (Tsai et al., 2022) generalizes this further by requiring interaction extensions of the original four Shapley axioms and positing the interaction scores as the coefficients of the most faithful order-$\ell$ polynomial regression fit to the pseudo-Boolean function. This yields unique, axiomatically natural interaction indices at each order.
2. Methodologies for Interaction Attribution
Interaction-focused indices are constructed using higher-order discrete derivatives and targeted aggregation across the powerset lattice:
- Shapley–Taylor index (Dhamdhere et al., 2019): For $\ell$-th order, employs symmetrized discrete derivatives and combinatorial averages across coalition contexts; directly linked to truncated Taylor expansions of the multilinear extension of $\nu$.
- Faith-Shap (Tsai et al., 2022): Solves a weighted regression over all subsets up to order $\ell$, with size-dependent weightings ensuring exact fit at the full and empty sets and generalized efficiency. Closed forms are provided via Möbius transforms and alternating sums, ensuring budget-balance and symmetry.
- Cardinal Interaction Indices and SHAP-IQ (Fumagalli et al., 2023): Systematic aggregation of context-sensitive discrete derivatives with interaction-order-dependent weights, permitting unified, unbiased Monte-Carlo estimation for any index satisfying linearity, symmetry, and dummy.
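The higher-order discrete derivative these indices aggregate is an inclusion-exclusion sum over subsets of the interaction set. A minimal sketch (the helper name is illustrative):

```python
from itertools import combinations

def discrete_derivative(v, S, T):
    """|T|-th order discrete derivative of v at context S with respect to T:
    the alternating (inclusion-exclusion) sum of v over subsets of T added
    to S. It vanishes whenever the features in T do not interact on top of S."""
    total = 0.0
    for r in range(len(T) + 1):
        for W in combinations(sorted(T), r):
            total += (-1) ** (len(T) - len(W)) * v(frozenset(S) | frozenset(W))
    return total
```

For an additive game $\nu(S) = |S|$ the pairwise derivative is zero at every context, while for the AND game $\nu(S) = \mathbf{1}[\{0,1\} \subseteq S]$ it equals 1 at the empty context, capturing the pure synergy.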
For practical computation, sampling-based approximations (e.g., SHAP-IQ, KernelSHAP generalizations), weighted least-squares fits, and modular permutation schemes are employed to mitigate the exponential scaling with $n$ of evaluating all $2^n$ subsets.
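As a concrete baseline for such sampling schemes, classic permutation sampling already yields an unbiased Shapley estimate from random feature orderings. A sketch (this is not the SHAP-IQ estimator, whose coalition weighting is more elaborate):

```python
import random

def shapley_permutation_estimate(v, n, num_samples=2000, seed=0):
    """Unbiased Monte-Carlo estimate of Shapley values: average each
    feature's marginal contribution over uniformly random orderings."""
    rng = random.Random(seed)
    phi = [0.0] * n
    order = list(range(n))
    for _ in range(num_samples):
        rng.shuffle(order)
        S, prev = frozenset(), v(frozenset())
        for i in order:
            S = S | {i}
            cur = v(S)
            phi[i] += cur - prev
            prev = cur
    return [p / num_samples for p in phi]
```

Because marginal contributions along each permutation telescope to $\nu(N) - \nu(\emptyset)$, the estimates satisfy efficiency exactly at any sample size; only the per-feature allocation carries sampling noise.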
Table: Leading Shapley Interaction Indices
| Index/Family | Axiomatic Basis | Efficiency | Truncation Order | Closed Form |
|---|---|---|---|---|
| Shapley–Taylor | Shapley axioms + interaction distribution | Yes | $\ell$ | Combinatorial (Theorem 1) |
| Faith-Shap | Faithful $\ell$-order regression | Yes | $\ell$ | Möbius / polynomial regression |
| SHAP-IQ (CII) | Linearity, symmetry, dummy | Partial | Arbitrary | Unified sum over subsets |
Both the Shapley–Taylor and Faith-Shap indices are provably unique up to their axioms, with Faith-Shap additionally characterized by the regression-based construction (Tsai et al., 2022).
3. Algorithmic Implementations and Scalability
Efficient estimation and computation of interactions above order two present substantial algorithmic and computational challenges due to exponential term counts:
- Weighted Regression (KernelSHAP variants): For interaction effects up to order $\ell$, weighted least-squares fitting yields consistent estimates, with cost dominated by the least-squares solve over sampled coalitions in the general case; much reduced for sparse or structured models.
- Sampling/Monte Carlo (SHAP-IQ): For arbitrary order, every sampled coalition is leveraged to simultaneously update all interaction estimates, so each model call contributes to every index at once, supporting order-of-magnitude speedups and variance control (Fumagalli et al., 2023).
- Specialized GNN Explainers (DistShap): Distributed implementation of the weighted regression on multi-GPU hardware enables edge-level attributions for graphs with up to millions of features, recovering higher-order edge synergies via Shapley-based linear surrogates (Akkas et al., 27 Jun 2025).
Feature grouping, targeted estimation (via importance heuristics), and order truncation (e.g., $\ell = 2$ or $3$) are standard practice for tractability.
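Feature grouping can be pictured as lifting the value function from individual features to groups, shrinking the effective player count. A minimal sketch (the `lift_to_groups` helper is illustrative):

```python
def lift_to_groups(v, groups):
    """Coarsen a game on individual features into a game whose 'players'
    are feature groups: a coalition G of groups activates the union of its
    members' features. Shrinks the player set from n to len(groups)."""
    def v_grouped(G):
        feats = frozenset(f for g in G for f in groups[g])
        return v(feats)
    return v_grouped
```

Any Shapley or interaction estimator can then be run unchanged on the coarsened game, at the cost of attributing only to groups rather than to single features.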
4. Structure- and Context-Aware Extensions
Standard Shapley-based techniques assume all feature subsets are valid coalitions, but for structured inputs such as graphs, this can result in out-of-distribution or uninformative explanations.
- Myerson–Taylor Index and Structure-Awareness: For GNNs and general structured domains, the Myerson–Taylor index decomposes the value function by component connectivity, summing only over connected subgraphs/components (Bui et al., 2024). This ensures that disconnected or "pathological" coalitions never artificially receive cross-component scores, uniquely satisfying axioms of component efficiency, restricted null player, and interaction-distribution for structure-aware games. The MAGE algorithm leverages this to efficiently recover influential motifs and subgraphs, outperforming standard Shapley-based explainers on both fidelity and interpretability.
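The structural restriction can be illustrated by enumerating only the coalitions that induce connected subgraphs; everything else is excluded from the sums. A sketch (not the MAGE implementation, and exhaustive enumeration is only viable for small graphs):

```python
from itertools import combinations

def induces_connected(S, adj):
    """True if node set S induces a connected subgraph
    (adj: node -> set of neighbors); the empty set counts as connected."""
    S = set(S)
    if not S:
        return True
    stack, seen = [next(iter(S))], set()
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        stack.extend((adj[u] & S) - seen)  # only traverse edges inside S
    return seen == S

def connected_coalitions(nodes, adj):
    """Enumerate only the coalitions a structure-aware (Myerson-style)
    index sums over: those inducing connected subgraphs."""
    return [frozenset(S)
            for r in range(1, len(nodes) + 1)
            for S in combinations(nodes, r)
            if induces_connected(S, adj)]
```

On the path graph 0–1–2, the coalition $\{0, 2\}$ is dropped because its induced subgraph is disconnected, leaving six valid coalitions out of seven.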
5. Empirical Behavior, Benchmarking, and Practical Limitations
Quantitative benchmarks consistently show that Shapley-based and interaction-focused explanations outperform symmetric or univariate attributions in faithfulness metrics, especially when significant feature synergy or redundancy exists (Sun et al., 2024, Bordt et al., 2022, Masoomi et al., 2023). For instance, bivariate or higher-order Shapley indices can recover truly interchangeable or antagonistic feature sets that univariate methods miss, and uniquely identify positive/negative interaction motifs in graphs.
However, these strengths entail interpretation complexity: pairwise (or higher-order) interaction matrices are difficult for humans to digest, motivating techniques for post-hoc clustering (e.g., via Louvain community detection to form span-based explanations) or graph condensation (e.g., identifying redundant/synergistic components via SCCs) (Sun et al., 2024, Masoomi et al., 2023). Computational costs, sampling noise, and scaling with $n$ set practical upper bounds on usable interaction order, with guidance typically recommending $\ell = 2$ or $3$ for most interpretable use cases.
Table: Diagnostic Strengths in Practice (Token/Text Models)
| Explanation Type | Faithfulness | Agreement with Humans | Simulatability | Complexity (Entropy) |
|---|---|---|---|---|
| TokenEx (Shapley) | High | Moderate/Low | Good in some cases | Low |
| TokenIntEx (Shapley) | High | Low | Moderate | Moderate/High |
| SpanIntEx (Shapley+Cl.) | Moderate | High | Highest | Low/Moderate |
Shapley-based interactions are "gold-standard" for faithfulness, while interaction-focused span explanations better support simulatability and alignment with annotated rationales (Sun et al., 2024).
6. Extensions: Decomposition, Dependence, and Directionality
Fundamental to Shapley-based explanations is the dependence on the choice of "value function" and imputation strategy:
- Conditional vs. Interventional Explanations: Conditional SHAP conveys model+data dependence; interventional SHAP isolates pure model effects. Recent work shows these can be exactly decomposed into interpretable direct (model) and indirect (data-dependence) parts per feature (Michiels et al., 2023).
- Directionality of Interactions: Standard Shapley indices are symmetric in set order. Frameworks such as bivariate directional Shapley graphs capture asymmetric interactions ("j influences i, but not vice versa"), facilitating the discovery of source/sink feature groups and interchangeability (Masoomi et al., 2023).
- Mitigation of Spurious Interactions: Asymmetric bias introduced by suboptimal baselines is recognized, with entropy-regularized baselines proposed to mitigate directional artifacts in explanations (Lu, 17 Feb 2025).
- Unification with fANOVA and Generalized Additive Models: The landscape of feature attribution methods is subsumed within a unified framework combining functional ANOVA decomposition with Shapley-based partial/fair allocation, showing that order-$\ell$ SHAP interactions recover the unique GAM of order $\ell$ (Fumagalli et al., 2024, Bordt et al., 2022).
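The Möbius (fANOVA-style) decomposition underlying this unification writes any pseudo-Boolean game as a sum of interaction terms, one per subset; truncating at low order yields the additive surrogate. A minimal sketch:

```python
from itertools import chain, combinations

def powerset(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

def moebius_coefficients(v, n):
    """Moebius transform of a pseudo-Boolean game: coefficients m(T) with
    v(S) = sum of m(T) over T subseteq S. Keeping only |T| <= ell gives
    the order-ell additive-model (fANOVA-style) surrogate."""
    m = {}
    for T in powerset(range(n)):
        T = frozenset(T)
        m[T] = sum((-1) ** (len(T) - len(W)) * v(frozenset(W))
                   for W in powerset(T))
    return m
```

For $\nu(S) = 2 \cdot \mathbf{1}[0 \in S] + 3 \cdot \mathbf{1}[\{0,1\} \subseteq S]$, the transform recovers exactly the main effect $m(\{0\}) = 2$ and the pure interaction $m(\{0,1\}) = 3$.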
7. Open Challenges and Future Directions
Open directions include:
- Efficient high-order estimation: Scaling unbiased interaction estimation to orders $\ell \ge 3$ in high dimensions remains a challenge, motivating further algorithmic and theoretical advances (Fumagalli et al., 2023).
- Interpretability and human alignment: Persistent gaps between model-faithful attributions and human rationales suggest the need for hybrid, user-centric explanation strategies that combine Shapley guarantees with semantically coherent interaction grouping (Sun et al., 2024, Lu, 17 Feb 2025).
- Structure-awareness: Integration of domain or task-specific priors (e.g., graph connectivity, span coherence) further enhances explainability and model alignment (Bui et al., 2024).
- Rigorous axiomatics: The study of axiomatic foundations for new indices (e.g., Myerson–Taylor) and for decomposed attributions (e.g., dependence-aware Shapley) continues to refine the scope and trustworthiness of model explanations (Tsai et al., 2022, Bui et al., 2024).
Shapley-based and interaction-focused explanations, backed by deep axiomatics and flexible algorithmic frameworks, offer an extensive and evolving toolkit for the faithful, interpretable, and actionable analysis of black-box models across modalities, domains, and model architectures.