Shapley-Value Decompositions
- Shapley-value decompositions are mathematical frameworks that allocate contributions fairly using axioms from cooperative game theory.
- Structural and algorithmic methods, including convex block decompositions and recursive attribution, reduce the exponential complexity of exact computations.
- They power diverse applications from feature attribution in machine learning to fair data valuation in economic and database query contexts.
Shapley-value decompositions are mathematical frameworks for distributing a value, cost, or responsibility among a set of contributors (players, features, data points, or sources) in a principled, axiomatic way. Originating from cooperative game theory, the canonical Shapley value assigns unique additive attributions to each player based on their average marginal contribution across all possible coalition orderings. The concept has led to a spectrum of decomposition methodologies spanning combinatorial games, functionals over high-dimensional spaces, statistical models, data markets, database queries, and probabilistic machine learning models.
1. Foundations: Shapley Value and Decomposition Principle
The Shapley value, introduced by Lloyd Shapley (1953), defines an additive decomposition for any set function $v: 2^N \to \mathbb{R}$ with $v(\emptyset) = 0$ (where $N$ is the set of players). For each $i \in N$, the value

$$\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr)$$

is the expectation, over uniformly random addition orderings, of the marginal contribution of $i$. The Shapley value is characterized by four axioms: efficiency ($\sum_{i \in N} \phi_i(v) = v(N)$), symmetry, dummy (null player), and linearity.
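The ordering-average definition can be sketched directly in code; a minimal illustration on a toy game (both the solver and the game are illustrative, not drawn from the cited works):

```python
import math
from itertools import permutations

def shapley_values(players, v):
    """Exact Shapley values by enumerating all |N|! addition orderings.

    v maps a frozenset of players to a real number, with v(frozenset()) = 0.
    Cost grows factorially, so this is only viable for small games.
    """
    phi = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            larger = coalition | {p}
            phi[p] += v(larger) - v(coalition)
            coalition = larger
    n_fact = math.factorial(len(players))
    return {p: s / n_fact for p, s in phi.items()}

# Toy superadditive game: a coalition is worth the square of its size.
v = lambda S: len(S) ** 2
phi = shapley_values(["a", "b", "c"], v)
# Efficiency: the attributions sum to v(N) = 9; symmetry: each player gets 3.
```

The telescoping sum along each ordering makes efficiency hold automatically: every ordering contributes exactly $v(N) - v(\emptyset)$ in total.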
In function decomposition, the Shapley construction applies not only to finite games but also to functions $f: \mathbb{R}^n \to \mathbb{R}$, via pointwise extension. For arbitrary functions, the "pointwise Shapley decomposition" is uniquely specified by nine functional axioms, including additivity, permutation invariance, null player, linearity, pointwise and parameter continuity, and coordinate-wise re-parameterization invariance. This leads to the unique formula

$$\phi_i f(x) = \sum_{S \subseteq [n] \setminus \{i\}} \frac{|S|!\,(n-|S|-1)!}{n!}\,\bigl(f(x_{S \cup \{i\}}) - f(x_S)\bigr),$$

where $x_S$ agrees with $x$ on the coordinates in $S$ and is held at a fixed reference point elsewhere. The formula covers all Borel-measurable functions and reduces to the set-function Shapley value on $\{0,1\}^n$ (Christiansen, 2023). These principles underpin applications from feature attribution to risk allocation.
2. Structural and Algorithmic Decomposition Techniques
A critical challenge with Shapley decompositions is their exponential computational complexity: for $n$ players, the naive algorithm requires $O(2^n)$ set-function evaluations (equivalently, averaging over $n!$ orderings). Multiple decompositional and algorithmic frameworks exploit problem structure to achieve tractable decomposition or efficient approximation.
2.1 Decomposability in Convex Games
In multiterminal data compression, the core of the coalitional cost game defined by the entropy function—i.e., the Slepian–Wolf region—is convex, and the entropy function is submodular. When the joint source model factorizes into independent clusters (formalized as decomposability), the Shapley-value decomposition splits accordingly:

$$\phi(v) = \bigl(\phi(v_{B_1}), \ldots, \phi(v_{B_K})\bigr),$$

where each $v_{B_k}$ pertains to one block $B_k$ of independent sources (Ding et al., 2018). The partition into blocks may be detected via a quadratic-time algorithm based on the Edmonds greedy procedure; subsequent Shapley computation is performed within each block, reducing the total cost from $O(2^n)$ to $O\bigl(\sum_k 2^{|B_k|}\bigr)$ for blocks of size $|B_k|$.
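Linearity makes the blockwise reduction easy to state in code. A minimal sketch (the decomposable game is a made-up example, and block detection, e.g. via the Edmonds greedy procedure, is assumed already done):

```python
import math
from itertools import permutations

def exact_shapley(players, v):
    """Naive factorial-time Shapley solver, applied within one block."""
    phi = {p: 0.0 for p in players}
    for order in permutations(players):
        S = frozenset()
        for p in order:
            phi[p] += v(S | {p}) - v(S)
            S = S | {p}
    f = math.factorial(len(players))
    return {p: s / f for p, s in phi.items()}

def shapley_by_blocks(blocks, v):
    """If v(S) = sum_k v(S ∩ B_k) over an independence partition {B_k},
    then v restricted to subsets of B_k is the block game itself, so each
    block is solved in isolation: cost sum_k |B_k|! instead of n!."""
    phi = {}
    for B in blocks:
        phi.update(exact_shapley(sorted(B), v))
    return phi

# Decomposable toy game over independent blocks {1, 2} and {3, 4}.
v = lambda S: len(S & {1, 2}) ** 2 + 3 * len(S & {3, 4})
phi = shapley_by_blocks([{1, 2}, {3, 4}], v)
```

Here two blocks of size 2 cost $2 \cdot 2! = 4$ orderings instead of $4! = 24$, and the per-block attributions still sum to the grand-coalition value.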
2.2 Recursive Function and Set Decomposition
The Shapley Sets framework recursively detects non-separable variable groups (NSVGs) in a predictive function, partitioning the variables into minimal blocks $B_1, \ldots, B_K$ such that features interact within each block but the blocks themselves are additively separable, $f(x) = \sum_k f_k(x_{B_k})$. The Shapley value is then computed at the block level:

$$\phi_{B_k}(x) = f_k(x_{B_k}) - \mathbb{E}\bigl[f_k(X_{B_k})\bigr].$$

This sharply avoids misallocation of joint effects in models or data with interaction structure, with a time complexity governed by the input dimension $d$ and the detected block structure rather than by $2^d$ (Sivill et al., 2023).
2.3 Hodge-theoretic and Spectral Decompositions
On the combinatorial hypercube, the Hodge decomposition splits the game's gradient into player-wise components corresponding to the Shapley decomposition. Each component game arises via a least-squares projection, and evaluating a player's component at the grand coalition reproduces that player's Shapley value (Stern et al., 2017).
For functionals of random variables, spectral decompositions using polynomial chaos expansions (PCE) enable rapid computation of extended Shapley–Owen interaction indices. The effects decompose as weighted sums of PCE coefficients against precomputed tables, reducing the computational cost from exponential in the number of inputs to a manageable dot product (Ruess, 2024).
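In the simplest special case of independent inputs and an orthonormal PCE basis, the idea reduces to a weighted sum over squared coefficients: each multi-index carries variance $c_\alpha^2$, and for independent inputs that interaction variance splits equally among the variables in its support. A sketch under those assumptions (the coefficient dictionary is a made-up example):

```python
def shapley_effects_from_pce(coeffs):
    """Normalized Shapley effects from PCE coefficients, assuming
    independent inputs and an orthonormal polynomial basis. The variance
    c_alpha^2 of each multi-index alpha is attributed in equal shares to
    the variables in its support; the equal split is valid only under
    input independence."""
    d = len(next(iter(coeffs)))
    phi = [0.0] * d
    total_var = 0.0
    for alpha, c in coeffs.items():
        support = [i for i, a in enumerate(alpha) if a > 0]
        if not support:
            continue  # the constant term contributes no variance
        total_var += c * c
        for i in support:
            phi[i] += c * c / len(support)
    # Normalize so the effects sum to one.
    return [p / total_var for p in phi]

# Hypothetical expansion: main effects on x0 and x1 plus one interaction.
effects = shapley_effects_from_pce({(0, 0): 5.0, (1, 0): 2.0,
                                    (0, 1): 1.0, (1, 1): 1.0})
```

The model-specific work is a single pass over the nonzero coefficients, which is the "dot product against a precomputed table" structure in miniature.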
3. Applications in Statistical and Machine Learning Contexts
3.1 Feature Attribution and Model Explanation
Shapley decompositions underpin model-agnostic explainability via attribution of model outputs to input features. The method applies to both scalar predictions (standard SHAP) and more complex outputs such as multiclass probabilities, where Shapley compositions ensure attributions remain valid on the simplex by leveraging Aitchison geometry (Noé et al., 2024).
For dependent features, explicit methods for conditional expectations are required; variational autoencoders with arbitrary conditioning (VAEAC) have demonstrated robust performance for estimating conditional distributions in both continuous and mixed-type data (Olsen et al., 2021).
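A permutation-sampling sketch of the marginal (interventional) variant, where absent features are filled in from a background sample (the model and data are illustrative; this shows the sampling idea, not any specific library's implementation):

```python
import random

def sampled_shapley(f, x, background, n_perm=2000, seed=0):
    """Monte Carlo Shapley feature attributions with the marginal value
    function v(S) = E_z[f(x_S spliced into background row z)]: walk a
    random feature ordering, switching one feature at a time from the
    background row to x, and average the observed output deltas."""
    rng = random.Random(seed)
    d = len(x)
    phi = [0.0] * d
    for _ in range(n_perm):
        order = rng.sample(range(d), d)
        z = list(rng.choice(background))  # start from a background row
        prev = f(z)
        for j in order:
            z[j] = x[j]                   # feature j joins the coalition
            cur = f(z)
            phi[j] += cur - prev
            prev = cur
    return [p / n_perm for p in phi]

# Linear toy model: attributions recover w_j * (x_j - background_j).
f = lambda z: 2.0 * z[0] + 1.0 * z[1]
phi = sampled_shapley(f, [1.0, 1.0], [[0.0, 0.0]], n_perm=500)
```

Replacing the background-row splice with draws from an estimated conditional distribution (as VAEAC provides) turns this same loop into the conditional-expectation variant.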
3.2 Data Valuation and Market Design
In data marketplaces, Shapley-value decompositions provide principled pricing and fair revenue allocation by quantifying a data point's or owner's marginal contribution to model utility. When data utility is additive over records and each record is independently generated (independent utility), the Shapley value “decouples” into per-tuple attributions, and further reduction is gained by solving small, per-record subproblems, yielding large exact-computation speedups over enumeration and Monte Carlo (Luo et al., 2022).
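Under independent utility the decoupling is immediate: an additive game's Shapley value is each player's standalone value, so owner-level attributions reduce to sums of per-tuple utilities (the ownership map and utilities below are hypothetical):

```python
def additive_owner_shapley(ownership, tuple_utility):
    """For an additive utility U(S) = sum_{t in S} u(t), the Shapley
    value of each owner equals the total utility of the tuples they
    hold; no coalition enumeration is needed."""
    return {owner: sum(tuple_utility[t] for t in tuples)
            for owner, tuples in ownership.items()}

# Hypothetical marketplace: two owners, three records.
phi = additive_owner_shapley({"alice": ["t1", "t2"], "bob": ["t3"]},
                             {"t1": 0.4, "t2": 0.1, "t3": 0.5})
# Efficiency holds by construction: the values sum to the total utility.
```

The hard part in practice is computing the per-tuple utilities themselves; the cited work solves small per-record subproblems for that step.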
Distributional Shapley values generalize data valuation to points outside the dataset by taking expectations over random draws, inheriting the full classical axioms and guaranteeing statistical stability under perturbations and distribution shift. Fast estimation via importance sampling and regression provides orders-of-magnitude speedup against previous methods (Ghorbani et al., 2020).
3.3 Database Query Analysis
In database query evaluation (e.g., conjunctive and aggregate queries), the Shapley value quantifies each tuple's contribution to the final query outcome. The dichotomy result for CQs shows that Shapley-value computation is polynomial-time for hierarchical queries without self-joins, but #P-hard otherwise; practical approximation algorithms via random-order sampling (FPRAS) are effective for general CQs (Livshits et al., 2019).
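The random-order sampling idea can be sketched for a monotone Boolean query (the query predicate is a made-up join witness; real FPRAS constructions are more involved):

```python
import random

def sampled_tuple_shapley(tuples, query_holds, n_samples=4000, seed=1):
    """Estimate each tuple's Shapley contribution to a monotone Boolean
    query, v(S) = 1 if the query holds on sub-database S, by crediting
    the tuple whose insertion flips the query from false to true along
    a random ordering."""
    rng = random.Random(seed)
    phi = {t: 0 for t in tuples}
    for _ in range(n_samples):
        order = rng.sample(tuples, len(tuples))
        S = set()
        prev = query_holds(S)
        for t in order:
            S.add(t)
            cur = query_holds(S)
            if cur and not prev:
                phi[t] += 1  # t completed a witness for the query
            prev = cur
    return {t: c / n_samples for t, c in phi.items()}

# Toy conjunctive query: true once both joining tuples are present.
q = lambda S: "r1" in S and "s1" in S
phi = sampled_tuple_shapley(["r1", "s1"], q)
```

For a monotone query, exactly one tuple receives credit per sampled ordering, so the estimates sum to one by construction; each symmetric join partner here converges to 1/2.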
3.4 Probabilistic and Function Decomposition
Variance-based Shapley decompositions (Shapley effects) provide a distribution-free method for measuring feature importance, overcoming challenges faced by classical ANOVA indices in dependent or “hole-ridden” distributions. The allocation is always nonnegative and sums to total variance without restrictive assumptions (Owen et al., 2016).
The PDD-SHAP algorithm uses a truncated functional ANOVA (partial dependence decomposition) to approximate black-box models by low-order surrogates, yielding Shapley values orders of magnitude more efficiently than exact or sampling-based methods when feature interactions are limited (Gevaert et al., 2022).
4. Extensions: Beyond Exchangeability and Standard Utilities
Recent research has expanded Shapley decompositions to address structured dependencies among contributors.
4.1 Precedence and Priority-Aware Shapley
The Priority-Aware Shapley Value (PASV) generalizes the classical value by incorporating both hard precedence constraints (specified as a partial order, i.e., a DAG) and soft, player-specific priority weights. PASV uniquely satisfies a set of axioms and averages over the orderings (linear extensions) consistent with the precedence, weighting each by stage-dependent selection probabilities. Monte Carlo estimation is made scalable via an adjacent-swap Metropolis–Hastings sampler. Analysis of limiting regimes (priority weights tending to zero or to infinity) interprets PASV as interpolating between precedence-only and weighted-Shapley variants (Lee et al., 10 Feb 2026).
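The precedence-only limiting case can be illustrated by averaging uniformly over the linear extensions of the partial order (a simplified, unweighted sketch, not the full PASV with priority weights or its MCMC sampler):

```python
from itertools import permutations

def precedence_shapley(players, v, precedes):
    """Shapley-style value averaged only over the linear extensions of a
    precedence relation. precedes is a set of (a, b) pairs meaning that
    a must enter every coalition ordering before b."""
    phi = {p: 0.0 for p in players}
    n_ext = 0
    for order in permutations(players):
        pos = {p: i for i, p in enumerate(order)}
        if any(pos[a] > pos[b] for a, b in precedes):
            continue  # not a linear extension of the partial order
        n_ext += 1
        S = frozenset()
        for p in order:
            phi[p] += v(S | {p}) - v(S)
            S = S | {p}
    return {p: s / n_ext for p, s in phi.items()}

# With "a before b", only 3 of the 6 orderings are valid, and the
# constraint shifts credit toward the later-entering player b.
v = lambda S: len(S) ** 2
phi = precedence_shapley(["a", "b", "c"], v, {("a", "b")})
```

Efficiency still holds (each valid ordering telescopes to $v(N)$), but symmetry is deliberately broken by the precedence constraint.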
4.2 Fairness and High-order Interactions
The Shapley-Owen effect provides an axiomatic decomposition with fair attribution of explained variance to higher-order coalitions, uniquely characterized by recursivity, block symmetry, and the classic Shapley properties. A spectral decomposition via PCE breaks the computational bottleneck, allowing the model-specific part of the computation to scale with the number of important coefficients while leveraging a precomputed, model-independent table (Ruess, 2024).
5. Practical Considerations, Limitations, and Empirical Behavior
Multiple empirical studies across synthetic and real-world settings attest to the conceptual and computational benefits of structured Shapley-value decompositions:
- Decomposable games enable exponential savings for large multiterminal systems, with empirical runtimes closely tracking theoretical predictions (Ding et al., 2018).
- On function attribution tasks, PDD-SHAP achieves high attribution fidelity with a two-way (pairwise) truncation of the ANOVA expansion while cutting cost by an order of magnitude, and VAEAC-based conditional Shapley consistently improves explanation fidelity for dependent or mixed data over independence-based surrogates (Olsen et al., 2021, Gevaert et al., 2022).
- In data valuation, independent utility and distributional Shapley frameworks yield scalable, fair, and robust value decompositions with high accuracy and minimal compute, even for millions of tuples (Luo et al., 2022, Ghorbani et al., 2020).
- For fairness analysis in risk or capital allocation, spectral Shapley-Owen methods converge rapidly and provide explicit error bounds—critical for high-dimensional, polynomial-expansion models (Ruess, 2024).
However, intrinsic limitations remain: exponential complexity persists when no structural decomposability or sparsity is present; the curse of dimensionality affects spectral methods at high feature counts without strong regularity; and per-record subproblems can become intractable when tuples are dense or highly overlapping. The choice of value functional (e.g., conditional versus marginal expectation) is context-sensitive and can meaningfully alter the resulting attributions (Michiels et al., 2023). Approximations via random sampling remain essential when full enumeration is infeasible.
Overall, Shapley-value decompositions comprise a foundational and unifying framework for the principled, efficient allocation of value, cost, or responsibility in complex systems, with a growing repertoire of structural adaptations and domain-specific algorithms across information theory, statistics, machine learning, database theory, and fairness analysis.