Joint Shapley Values: A Generalized Attribution Framework
- Joint Shapley Values are attribution measures that generalize classical Shapley values to quantify the joint contribution of coalitions with complex interdependencies.
- They extend fairness axioms—such as joint efficiency, null, and symmetry—to yield unique and consistent value attributions for any subset of agents or features.
- They are practically applied in machine learning attribution, data valuation, and information decomposition, employing methods like Monte Carlo sampling and regression adjustments for tractability.
Joint Shapley Values are a class of attribution measures that generalize the classical Shapley value from cooperative game theory to quantify the joint contribution of sets (coalitions) of agents, features, or data items in contexts with complex interdependencies, externalities, or structured interactions. Whereas the individual Shapley value gives the average marginal effect of a single agent over all possible orderings, joint Shapley values systematically extend the attribution to arbitrary sets, capturing synergistic, antagonistic, or canceling effects among them. These joint attributions are grounded in axiomatic extensions of Shapley’s fairness conditions and are often unique under suitably defined constraints.
1. Mathematical Formulation and Axiomatic Foundations
Joint Shapley Values (JSV) are constructed as axiomatic extensions of the classical Shapley value. For a set N of n agents/features and a cooperative game v: 2^N → ℝ (with v(∅) = 0), the classical Shapley value for i ∈ N is

ϕ_i(v) = Σ_{S ⊆ N∖{i}} [ |S|! (n − |S| − 1)! / n! ] · [ v(S ∪ {i}) − v(S) ],

which averages marginal contributions over all permutations. The Joint Shapley Value for any coalition T ⊆ N generalizes this to

ϕ_T(v) = Σ_{S ⊆ N∖T} q_{|S|} · [ v(S ∪ T) − v(S) ],

where the weights q_s are determined by extended axioms:
- Joint Linearity (JLI): Linearity in v,
- Joint Null (JNU): Zero value for null coalitions,
- Joint Efficiency (JEF): Total value v(N) partitioned among all coalitions,
- Joint Anonymity (JAN): Attribution invariant under relabelings,
- Joint Symmetry (JSY): Coalitions with identical effects receive identical values.
The sequence of weights q_s is uniquely specified for each order of explanation k (max coalition size) by a well-defined recurrence, yielding uniqueness of the JSV (Harris et al., 2021). For k = 1, the JSV recovers the classical (singleton) Shapley value.
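To make both formulas concrete, the following brute-force sketch computes them for small n (the enumeration is exponential in n, so it is an illustration only). The game v is supplied as a Python function on frozensets of player indices; the weight vector q passed to joint_attribution is a caller-supplied placeholder for the order-k weights of Harris et al. (2021), not their derived values.

```python
from itertools import combinations
from math import factorial

def classical_shapley(v, n):
    """Exact classical Shapley values phi_i, summing weighted marginal contributions
    over all coalitions S not containing i (exponential in n; illustration only)."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                S = frozenset(S)
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (v(S | {i}) - v(S))
    return phi

def joint_attribution(v, n, T, q):
    """Joint form sketched above: sum over coalitions S disjoint from T of
    q[|S|] * (v(S ∪ T) - v(S)). Here q is a caller-supplied placeholder for the
    order-k weights of Harris et al. (2021)."""
    T = frozenset(T)
    rest = [j for j in range(n) if j not in T]
    total = 0.0
    for r in range(len(rest) + 1):
        for S in combinations(rest, r):
            S = frozenset(S)
            total += q[len(S)] * (v(S | T) - v(S))
    return total

# Example: a 3-player game where players 0 and 1 have an extra synergy of 1.
v = lambda S: len(S) + (1.0 if {0, 1} <= S else 0.0)
print(classical_shapley(v, 3))                         # [1.5, 1.5, 1.0]
print(joint_attribution(v, 3, {0, 1}, q=[0.5, 0.25]))  # 2.25 with these placeholder weights
```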
2. Extension to Partition-Function Games and Externalities
In partition-function games, the value of a coalition S depends on the partition P of the remaining agents, capturing externalities. The marginal contribution of an agent i is then ambiguous because it varies depending on which coalition i joins after leaving S. The generalization is formalized through elementary marginal contributions, one for each coalition that i could join after leaving S, and the marginal contribution of i is defined as their weighted average, where α_i is a symmetric, normalized, non-negative weighting over these transfers. Shapley’s classical axioms, when “lifted” to this setting, yield a unique joint value for each weighting scheme (Skibski et al., 2013). The values of Macho-Stadler et al., McQuillin, and Bolger correspond to particular choices of α.
A Monte Carlo algorithm is provided to approximate joint Shapley values under arbitrary α via sampling permutations and partitions, yielding unbiased, consistent estimators with quantifiable error bounds.
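The sampling principle can be illustrated with a minimal permutation-sampling estimator for the singleton, externality-free case; the algorithm of Skibski et al. (2013) additionally samples, for each permutation, a partition of the agents outside the growing coalition according to α, which this sketch omits.

```python
import random

def shapley_monte_carlo(v, n, num_samples=2000, seed=0):
    """Unbiased permutation-sampling estimate of singleton Shapley values.
    With externalities, each draw would also sample a partition of the agents
    outside the growing coalition according to the weighting α (omitted here)."""
    rng = random.Random(seed)
    est = [0.0] * n
    players = list(range(n))
    for _ in range(num_samples):
        rng.shuffle(players)
        coalition = frozenset()
        prev = v(coalition)
        for i in players:
            coalition = coalition | {i}
            cur = v(coalition)
            est[i] += cur - prev
            prev = cur
    return [e / num_samples for e in est]
```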
3. Interpretations and Comparison with Interaction Indices
While interaction indices in attribution frameworks decompose effects recursively over elements or pairs and adjust for "interactions", joint Shapley values directly attribute the marginal effect of a coalition T by averaging v(S ∪ T) – v(S) over all possible coalitions S. This direct extension of axiomatic fairness ensures:
- Null coalitions receive zero (extended null axiom).
- Joint efficiency partitions the total outcome over coalitions.
- Coalitions’ joint effect is assessed independently of the sum of constituent singleton effects.
For instance, when features are redundant, Joint Shapley Values still assign positive value to the singletons and to their coalition, whereas some interaction indices assign the coalition a negative value to signal the cancellation (Harris et al., 2021).
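A toy illustration of this contrast (the game and numbers are illustrative, not drawn from Harris et al.): with two fully redundant features, the pairwise interaction index is negative while the direct joint marginal effect of the pair remains positive.

```python
# The "OR" game: two fully redundant features, v(S) = 1 whenever S is non-empty.
def v(S):
    return 1.0 if S else 0.0

# Pairwise Shapley interaction index for {1, 2} (n = 2, so only S = {} contributes):
delta_12 = v({1, 2}) - v({1}) - v({2}) + v(set())   # -1.0: redundancy flagged as negative
# Direct joint marginal effect of the coalition T = {1, 2}, taken over S = {}:
joint_12 = v({1, 2}) - v(set())                      # +1.0: the pair's joint effect is present
print(delta_12, joint_12)
```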
4. Computational Methods and Approximations
Joint Shapley values are challenging to compute exactly because the number of coalitions grows combinatorially with n. Several methods address tractability:
- Regression-Adjusted Monte Carlo: A surrogate function f (often an efficiently computable tree-based model) is fitted to approximate v, and the joint Shapley value is decomposed by linearity as ϕ_J(v) = ϕ_J(f) + ϕ_J(v – f), with the residual term estimated via Monte Carlo sampling. Indicator functions select the coalitions containing J, and samples are reused across all features/groups to save computation. The estimator is unbiased, and its variance shrinks as f fits v more closely (Witter et al., 13 Jun 2025); a control-variate sketch appears after this list.
- Paired-Sampling Approximations: Pairing random coalition samples with their complements (KernelSHAP) or permutations with their reversals (PermutationSHAP) reduces variance. For value functions restricted to pairwise (maximal order-2) interactions, a single paired sample already yields exact Shapley values, and the paired PermutationSHAP estimator retains the additive recovery property for modular value functions, which the kernel version lacks (Mayer et al., 18 Aug 2025); an antithetic-permutation sketch appears after this list.
- Monte Carlo Approximations with Externalities: Sampling both agent permutations and partitions under the chosen α weighting provides unbiased approximations of joint Shapley values in partition-function games with externalities, including error control via variance bounds (Skibski et al., 2013).
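The regression-adjusted decomposition above can be sketched as a simple control variate, assuming the caller supplies an exact attribution routine for the surrogate f and a Monte Carlo routine for the residual game (for example the brute-force and permutation-sampling sketches given earlier); the function names and signatures are illustrative, not those of Witter et al.

```python
def regression_adjusted_shapley(v, f, n, exact_attr, mc_attr):
    """phi(v) = phi(f) + phi(v - f), by linearity of the attribution.
    exact_attr computes the surrogate's attribution exactly (cheap if f is simple);
    mc_attr estimates the residual game by Monte Carlo. The variance of the estimate
    shrinks as the surrogate f approaches v."""
    residual = lambda S: v(S) - f(S)
    phi_f = exact_attr(f, n)
    phi_residual = mc_attr(residual, n)
    return [a + b for a, b in zip(phi_f, phi_residual)]
```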
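The antithetic idea behind paired permutation sampling can likewise be sketched minimally: each sampled permutation is evaluated together with its reversal and the two marginal-contribution vectors are averaged. The function name is illustrative; as noted above, for a value function with at most pairwise interactions a single such pair already reproduces the exact singleton values.

```python
import random

def paired_permutation_shapley(v, n, num_pairs=500, seed=0):
    """Average marginal contributions over sampled permutations and their reversals."""
    rng = random.Random(seed)
    est = [0.0] * n

    def sweep(order):
        coalition, prev = frozenset(), v(frozenset())
        contrib = [0.0] * n
        for i in order:
            coalition = coalition | {i}
            cur = v(coalition)
            contrib[i] = cur - prev
            prev = cur
        return contrib

    for _ in range(num_pairs):
        order = list(range(n))
        rng.shuffle(order)
        forward = sweep(order)
        backward = sweep(order[::-1])
        for i in range(n):
            est[i] += 0.5 * (forward[i] + backward[i])
    return [e / num_pairs for e in est]
```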
5. Functional Decomposition and Generalized Additive Models
n-Shapley Values extend classical Shapley attributions to include interaction terms up to order n. For |S| ≤ n, the n-Shapley value of S is defined recursively from the Shapley interaction indices Δ_S, with Bernoulli-number weights B_k redistributing the interactions of order above n onto the retained lower-order terms. As n grows to the total number of features d, the decomposition recovers a full generalized additive model (GAM) with higher-order interactions (Bordt et al., 2022). Under observational SHAP, n-Shapley values characterize the contributions correctly only when features are independent; otherwise interventional SHAP is required for the decomposition to be valid.
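The building blocks Δ_S can be computed by brute force for small problems; the following sketch implements the standard pairwise Shapley interaction index (a textbook formula, not code from Bordt et al.).

```python
from itertools import combinations
from math import factorial

def shapley_interaction_pair(v, n, i, j):
    """Pairwise Shapley interaction index: for every coalition S avoiding i and j,
    weight the discrete second difference
    v(S∪{i,j}) - v(S∪{i}) - v(S∪{j}) + v(S) by |S|! (n-|S|-2)! / (n-1)!."""
    others = [k for k in range(n) if k not in (i, j)]
    delta = 0.0
    for r in range(len(others) + 1):
        for S in combinations(others, r):
            S = frozenset(S)
            w = factorial(len(S)) * factorial(n - len(S) - 2) / factorial(n - 1)
            delta += w * (v(S | {i, j}) - v(S | {i}) - v(S | {j}) + v(S))
    return delta
```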
6. Applications in Attribution, Data Valuation, and Information Decomposition
Joint Shapley values are increasingly applied across fields:
- Machine Learning Attribution: Model predictions are explained by assigning contributions to feature sets rather than only single features (a minimal value-function sketch follows this list). Presence-adjusted global values are introduced for binary features, ensuring local consistency when features are absent (Harris et al., 2021). Multicollinearity adjustments "decorrelate" feature groups before joint values are calculated (Basu et al., 2020).
- Database Management: Joint contributions of tuples to query answers and inconsistency measures are quantified, with practical solutions including reductions to probabilistic query evaluation and Monte Carlo approximation schemes (Bertossi et al., 11 Jan 2024, Livshits et al., 2019).
- Dataset Valuation: Efficient proxies (e.g., DU-Shapley) exploit structure in utility functions, drastically reducing computational complexity in federated or collaborative learning scenarios (Garrido-Lucero et al., 2023).
- Information Decomposition: Shapley-style (random-order) values decompose mutual information or utility among predictor sets in a coalitional game defined on a boolean algebra of predictors, respecting hierarchical structures and yielding nonnegative attributions (Kroupa et al., 2022).
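As a sketch of the value-function construction that underlies feature-set attribution in this setting, assuming a simplified baseline in which features outside the coalition are replaced by their background means (the model and data handles are illustrative):

```python
import numpy as np

def make_value_function(model_predict, x, background):
    """Return v(S): the model's prediction for instance x with every feature outside
    the coalition S replaced by its background mean (a simple interventional-style
    baseline). model_predict maps a (1, d) array to a length-1 array of outputs."""
    baseline = background.mean(axis=0)
    def v(S):
        z = baseline.copy()
        for j in S:
            z[j] = x[j]
        return float(model_predict(z.reshape(1, -1))[0])
    return v

# The resulting v can be passed to the joint or sampling routines sketched earlier,
# with T a set of feature indices whose joint contribution is to be attributed.
```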
7. Extensions with Hodge Theory and Cooperative Networks
Joint Shapley Values have been further extended via combinatorial Hodge theory, decomposing games into component games per player, with allocations to every coalition state—not just the grand coalition—via least squares solutions on the coalition hypercube. Stochastic path integral representations link these allocations to expectations over coalition formation processes driven by Markov chains. Modified axioms (notably a reflection axiom) uniquely characterize the extended value allocations, providing a richer framework for cooperative networks and dynamic coalition processes (Lim, 2022, Lim, 2021).
Joint Shapley Values unify and generalize attribution principles for cooperative situations, partition-function games with externalities, machine learning model explanations, database query responsibility, information decomposition, and resource-sharing problems. They are mathematically grounded in extended axiomatic systems, computationally tractable via regression and sampling enhancements, and uniquely suited to analyze the joint effects of sets—providing a rigorous and flexible foundation for both theoretical and practical tasks in many domains.