
Shapley Value Analysis

Updated 29 December 2025
  • Shapley values are defined by axioms such as efficiency, symmetry, dummy, and additivity for fairly allocating payoffs among players or features.
  • Efficient estimation frameworks—including sampling, amortization, and sketching—mitigate exponential costs, enabling practical application in high-dimensional models.
  • The analysis extends to causal attribution and interaction indices, enhancing interpretability and robustness in domains like machine learning and database query evaluation.

The Shapley value is a foundational concept in cooperative game theory, providing a unique, axiomatically justified method for fairly distributing the total payoff among agents or assigning importance scores to features or data points in machine learning. However, its exact computation scales exponentially with the number of entities, necessitating both theoretical advances and algorithmic innovations for practical analysis, estimation, and interpretation. Recent research has leveraged structure in the value function, efficient sampling and amortization, generalizations for weighted coalitions, domain-specific modeling, and a deeper connection to causality, interaction indices, and interpretable models.

1. Definition, Properties, and Structural Generalizations

The Shapley value is defined for a cooperative game $(N, v)$, where $N$ is a set of $n$ players (features, data points, tuples, etc.) and $v: 2^N \to \mathbb{R}$ is a set function satisfying $v(\varnothing) = 0$. The Shapley value for player $i$ is:

$$\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!} \left[ v(S \cup \{i\}) - v(S) \right]$$

This is the unique allocation satisfying efficiency, symmetry, dummy, and additivity. In the context of high-dimensional or structured games, approximation and generalization become critical:

  • k-Additive Games: Any value function $v$ admits a Möbius decomposition $v(S) = \sum_{A \subseteq S} m(A)$. If $m(A) = 0$ for $|A| > k$ (a $k$-additive game), the number of free parameters is $\sum_{j=0}^{k} \binom{n}{j}$, and the Shapley value of the $k$-additive surrogate is computable in closed form (Pelegrina et al., 7 Feb 2025).
  • Weighted Shapley Values: Generalize the classical weighting across subset cardinalities. Let $w: \{0, \ldots, n-1\} \to \mathbb{R}_+$; the weighted Shapley value is

$$\phi_j^w(v) = \sum_{S \subseteq N \setminus \{j\}} w(|S|)\left[v(S \cup \{j\}) - v(S)\right]$$

which supports more semantically aligned attributions and can be estimated efficiently via a weighted least squares characterization (Panda et al., 9 Mar 2025).
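As a concrete illustration of the definitions above, the following sketch enumerates coalitions directly; the optional `weight` argument switches from the classical kernel to a weighted variant. The three-player "glove game" is a standard textbook toy example, not drawn from the cited papers:

```python
from itertools import combinations
from math import factorial

def shapley_values(n, v, weight=None):
    """Exact (weighted) Shapley values by direct enumeration of coalitions.

    v      : maps a frozenset of player indices to a payoff, v(frozenset()) == 0
    weight : optional w(|S|); defaults to the classical |S|!(n-|S|-1)!/n! kernel
    """
    if weight is None:
        weight = lambda s: factorial(s) * factorial(n - s - 1) / factorial(n)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                S = frozenset(S)
                # Marginal contribution of i to coalition S, weighted by w(|S|)
                phi[i] += weight(len(S)) * (v(S | {i}) - v(S))
    return phi

# Three-player "glove game": players 0 and 1 hold left gloves, player 2 a
# right glove; a coalition earns one unit per matched pair.
glove = lambda S: min(len(S & {0, 1}), len(S & {2}))
print(shapley_values(3, glove))  # ≈ [1/6, 1/6, 2/3]
```

The scarce right glove earns player 2 the largest share, and the three values sum to $v(N) = 1$ (efficiency). The loop is $O(2^n)$, which is exactly the cost that the estimators in the next section avoid.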

2. Efficient Shapley Value Estimation: Frameworks and Algorithms

The exponential cost of enumerating all $2^n$ coalitions has motivated a taxonomy of estimators:

  • Sampling-based Approaches: Random-permutation sampling (Castro et al.), stratified sampling (by size or structure), and multilinear-extension (Owen) sampling achieve unbiased estimation with controlled variance; the multilinear extension integrates over random subset-size distributions for improved convergence (Okhrati et al., 2020).
  • Amortized and Surrogate Models: Amortized models regress precomputed, Shapley-supervised attributions onto fast-to-compute features (e.g., neural embeddings), yielding deterministic estimates that are orders of magnitude faster. This framework is empirically validated in text classification (60x speedup over KernelSHAP) (Yang et al., 2023) and in general feature attribution and data valuation (Panda et al., 9 Mar 2025).
  • Least Squares Sketching: The Shapley solution arises as the minimizer of a weighted linear system, enabling estimators based on random sketching matrices. This formalization yields the first non-asymptotic guarantees for estimators such as KernelSHAP and LeverageSHAP, with provable sample complexity (Chen et al., 5 Jun 2025).
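A minimal sketch of the permutation-sampling idea from the first bullet: average each player's marginal contribution over uniformly random orderings. The toy game is illustrative only:

```python
import random

def shapley_permutation_mc(n, v, num_perms=2000, seed=0):
    """Unbiased Monte Carlo estimate: for each random ordering, credit every
    player with its marginal contribution when it joins its predecessors."""
    rng = random.Random(seed)
    phi = [0.0] * n
    order = list(range(n))
    for _ in range(num_perms):
        rng.shuffle(order)
        S, prev = set(), v(frozenset())
        for i in order:
            S.add(i)
            cur = v(frozenset(S))
            phi[i] += cur - prev  # marginal contribution of i in this ordering
            prev = cur
    return [p / num_perms for p in phi]

glove = lambda S: min(len(S & {0, 1}), len(S & {2}))  # same toy glove game
est = shapley_permutation_mc(3, glove)
print(est)  # close to the exact values [1/6, 1/6, 2/3]
```

Because each sampled permutation contributes a telescoping sum equal to $v(N) - v(\varnothing)$, the estimates satisfy efficiency exactly; only the per-player split carries sampling noise.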
| Methodology | Core Idea | Computational Gain |
|---|---|---|
| k-Additive | Fit low-order surrogate model | Poly($n$) for fixed $k$ |
| Multilinear | Integrate over subset-size parameter $q$ | Lower variance |
| Amortized | Learned regression | 10x–100x speedup |
| Sketching | Random least-squares sketching | Provable sample complexity |
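The least-squares view can be made concrete for small $n$ by solving the Shapley-kernel-weighted regression over all proper nonempty coalitions exactly, with the efficiency constraint handled by a Lagrange multiplier. This is a sketch of the classical KernelSHAP formulation, not the sketching estimators of the cited papers:

```python
import numpy as np
from itertools import combinations
from math import comb

def kernel_shap_exact(n, v):
    """Shapley values as the solution of a Shapley-kernel-weighted least
    squares fit of an additive model to v over all proper coalitions."""
    rows, y, w = [], [], []
    for k in range(1, n):
        for S in combinations(range(n), k):
            z = np.zeros(n)
            z[list(S)] = 1.0
            rows.append(z)
            y.append(v(frozenset(S)) - v(frozenset()))
            # Shapley kernel weight for a coalition of size k
            w.append((n - 1) / (comb(n, k) * k * (n - k)))
    X, y, W = np.array(rows), np.array(y), np.diag(w)
    A, b = X.T @ W @ X, X.T @ W @ y
    ones = np.ones(n)
    c = v(frozenset(range(n))) - v(frozenset())  # efficiency target
    Ainv_b = np.linalg.solve(A, b)
    Ainv_1 = np.linalg.solve(A, ones)
    lam = (ones @ Ainv_b - c) / (ones @ Ainv_1)  # Lagrange multiplier
    return Ainv_b - lam * Ainv_1

glove = lambda S: min(len(S & {0, 1}), len(S & {2}))  # toy glove game
print(kernel_shap_exact(3, glove))  # ≈ [1/6, 1/6, 2/3]
```

With full enumeration the solution coincides with the exact Shapley values; the sketching and sampling estimators in the table replace the full design matrix with a random subsample of its rows.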

3. Domain-Specific Modeling and Generalizations

Shapley analysis extends across diverse domains, requiring adaptation to context:

  • Database Tuples and Queries: In data management, the Shapley value quantifies tuple/record contribution to relational or aggregate query answers; exact tractability is characterized by structural query properties (e.g., hierarchy classes for conjunctive queries). Approximation algorithms include an FPRAS and advanced stratified sampling (RSS/ARSS), which stratifies by relation-wise tuple counts and adaptively allocates samples to high-variance strata, sharply reducing error and variance in join-heavy relational workloads (Livshits et al., 2019, Standke et al., 16 Sep 2025, Alizad et al., 27 Nov 2025).
  • Reinforcement Learning: SVERL and Counterfactual Shapley Values treat state features as players, with characteristic functions that reflect policy shifts and counterfactual outcome differences between actions. These methods target performance explanation and action-selection transparency, outperforming off-the-shelf SHAP, which cannot model behavioral shifts (Beechey et al., 2023, Shi et al., 5 Aug 2024).
  • Data Valuation: Data Shapley and its derivatives (e.g., CS-Shapley, DShapley, counterfactual explanations for coalitions) rigorously quantify individual datum utility for learning. Class-wise extensions provide in-class and out-of-class contribution separation (Schoch et al., 2022), while distributional perspectives exploit model structure for closed-form analytic solutions in regression, classification, and density estimation, scaling to $10^5$ points (Kwon et al., 2020). Counterfactual explanations for Shapley orderings are formulated as subset-minimization problems which are NP-hard, with practical greedy heuristics for scalability (Si et al., 2 Jul 2025).
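A self-contained sketch of the data-valuation idea, using permutation sampling on synthetic data with a 1-nearest-neighbor utility. The dataset, utility choice, and the chance-accuracy baseline $v(\varnothing) = 0.5$ are all illustrative assumptions, not taken from the cited methods:

```python
import numpy as np

# Toy two-class training set; the last point is deliberately mislabeled.
X_tr = np.array([[-2.0, 0.0], [-2.1, 0.2], [2.0, 0.1], [2.2, -0.1], [-2.0, 0.1]])
y_tr = np.array([0, 0, 1, 1, 1])   # index 4 sits in class 0's region
X_val = np.array([[-2.0, 0.0], [-1.9, 0.1], [2.1, 0.0], [1.9, -0.2]])
y_val = np.array([0, 0, 1, 1])

def utility(subset):
    """Validation accuracy of a 1-NN classifier trained on `subset`;
    the empty coalition scores chance accuracy (0.5) by convention."""
    if not subset:
        return 0.5
    idx = np.array(sorted(subset))
    d = np.linalg.norm(X_val[:, None, :] - X_tr[idx][None, :, :], axis=2)
    preds = y_tr[idx][d.argmin(axis=1)]
    return float(np.mean(preds == y_val))

def data_shapley(num_perms=200, seed=1):
    """Permutation-sampling Data Shapley: a point's value is its average
    marginal effect on validation accuracy across random arrival orders."""
    rng = np.random.default_rng(seed)
    n = len(X_tr)
    phi = np.zeros(n)
    for _ in range(num_perms):
        S, prev = set(), utility(set())
        for i in rng.permutation(n):
            S.add(int(i))
            cur = utility(S)
            phi[i] += cur - prev
            prev = cur
    return phi / num_perms

vals = data_shapley()
print(vals)  # the mislabeled point (index 4) receives a negative value
```

The mislabeled point hurts validation accuracy whenever it arrives, so its value is negative, which is exactly the signal used for model-agnostic data pruning.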

4. Interpretability, Attribution, and Causal Perspective

Shapley value interpretation is sensitive to the definition of the value function and underlying feature/statistical assumptions:

  • Causal Analysis: Marginal (interventional) Shapley correctly interprets interventions and provides attributions that align with the effect of do-operations in the causal graphical model. By contrast, conditional (observational) Shapley can ascribe nonzero credit to dummy variables under correlation, violating desirable causal properties. Conditional Shapley is shown to be causally unsound and is not recommended when features are correlated (Rozenfeld, 10 Sep 2024).
  • Interaction Indices and GAM Correspondence: The n-Shapley family interpolates between classical additive Shapley and full functional decompositions, with higher orders (e.g., Shapley–Taylor, Shapley–Faith–Shap) recapturing unique GAM decompositions up to order n, and partial dependence plots revealing the influence of interactions (Bordt et al., 2022).
  • Robustness to Correlation: When features are linearly correlated, correlation-adjusted Shapley approaches (matrix-based de-correlation) guarantee additivity and de-bias attributions, preserving true effect sizes under multicollinearity (Basu et al., 2020).
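The marginal-versus-conditional distinction above can be demonstrated on a two-feature toy model in which $f$ ignores $x_2$ entirely while $x_2$ is almost perfectly correlated with $x_1$. The data, instance, and window size below are illustrative choices, and the conditional expectation is estimated with a crude box kernel:

```python
import numpy as np

rng = np.random.default_rng(0)

# Background data with strongly correlated features; the model ignores x2.
x1 = rng.normal(size=5000)
bg = np.column_stack([x1, x1 + 0.01 * rng.normal(size=5000)])
f = lambda X: X[:, 0]              # x2 is a dummy feature of f
x = np.array([1.5, 1.5])           # instance to explain

def v_marginal(S):
    """Interventional value: clamp features in S to x, draw the rest from the
    background marginal (this breaks the x1-x2 correlation)."""
    Z = bg.copy()
    for j in S:
        Z[:, j] = x[j]
    return f(Z).mean()

def v_conditional(S):
    """Observational value: average f over background points whose S-features
    lie near x (a crude kernel estimate of E[f(X) | X_S = x_S])."""
    if not S:
        return f(bg).mean()
    idx = sorted(S)
    mask = np.all(np.abs(bg[:, idx] - x[idx]) < 0.05, axis=1)
    return f(bg[mask]).mean()

def phi(v, j):
    """Exact two-feature Shapley value from the four coalition values."""
    other = 1 - j
    return 0.5 * ((v({j}) - v(set())) + (v({0, 1}) - v({other})))

print(phi(v_marginal, 1))      # ~0: marginal Shapley gives the dummy no credit
print(phi(v_conditional, 1))   # clearly positive: correlation leaks credit to x2
```

Because $f$ never reads $x_2$, the interventional value function is unchanged by clamping $x_2$, so the dummy's marginal Shapley value vanishes identically; the conditional variant rewards $x_2$ merely for predicting $x_1$, illustrating the causal unsoundness noted above.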

5. Empirical Evidence and Practical Considerations

Empirical studies validate methodological advances:

  • Approximation Quality: The $k$-additive surrogate estimator (SVA$_{k\text{-ADD}}$) outperforms permutation and stratified sampling once $k \geq 3$, with both global and local explanation fidelity (measured as MSE versus exact Shapley values) converging rapidly under moderate sample budgets (Pelegrina et al., 7 Feb 2025).
  • Weighted Attribution: FW-Shapley achieves lower Inclusion AUC error and higher computational efficiency compared to FastSHAP and KNN-based data valuation (Panda et al., 9 Mar 2025).
  • Relational Sampling: ARSS in database queries consistently outperforms naive stratified or Monte Carlo via low-variance, join-aware strata and adaptive allocation, enabling estimation at interactive latencies for heavy analytical SQL workloads (Alizad et al., 27 Nov 2025).
  • Stability and Transferability: Amortized models provide stable, deterministic outputs in text classification and data valuation settings, and class-wise Shapley values transfer well between classifiers, aiding model-agnostic data-pruning decisions (Yang et al., 2023, Schoch et al., 2022).

6. Theoretical Insights and Limitations

Theoretical advances include:

  • Consistency and Error Bounds: For regression-based approximation (including weighted-SLS and sketching), convergence in sample size and provable finite-sample mean squared error bounds are established; optimization is strongly convex under mild weighting assumptions (Panda et al., 9 Mar 2025, Chen et al., 5 Jun 2025).
  • Tractability in Databases: Aggregate function choice determines tractable query classes for Shapley computation—e.g., sum/count over existential-hierarchical CQs, min/max/count-distinct over all-hierarchical CQs, average/quantile over q-hierarchical structures (Standke et al., 16 Sep 2025).
  • Causal Extensions: Asymmetric Shapley relaxes symmetry to incorporate ordering (permutation) constraints, which, if unconstrained, can lead to counterintuitive attributions in the presence of non-additive or non-linear interactions. Restricting to GAMs recovers sensible variance-reduction attributions (Kelen et al., 2023).

7. Current Directions and Open Challenges

Recent trends and open problems include:

  • Higher-Order Interactions: Efficient computation and interpretation of n-Shapley and related higher-order indices for complex, high-dimensional models (Bordt et al., 2022).
  • Bayesian and Adaptive Estimation: Development of stratification/adaptation paradigms beyond relation- or size-based strata, and integration with online and active learning scenarios (Alizad et al., 27 Nov 2025).
  • Causal and Counterfactual Attribution: Further integration of causal graph structure into Shapley analysis (e.g., asymmetric or weighted variants informed by structural constraints) and refinement of counterfactual explanations in data coalitions (Si et al., 2 Jul 2025, Rozenfeld, 10 Sep 2024).
  • Scaling to Ultra-High Dimensions: Ongoing engineering and theoretical work targets model-agnostic, sample-efficient, and domain-specialized estimators for feature sets in tens of thousands or more.

Shapley value analysis has evolved along three intertwined axes: theory, estimation, and domain modeling. Contemporary research unites axiomatic fairness, combinatorial and statistical efficiency, and the interpretability demands of modern machine learning, database analysis, and causal reasoning (Pelegrina et al., 7 Feb 2025, Chen et al., 5 Jun 2025, Alizad et al., 27 Nov 2025, Rozenfeld, 10 Sep 2024, Kelen et al., 2023).
