Faithful Group Shapley Value (FGSV)

Updated 13 September 2025

Faithful Group Shapley Value (FGSV) is a group-level extension of the classical Shapley value that captures both individual contributions and synergistic interactions.
It incorporates axioms such as null player, symmetry, and faithfulness to ensure robustness against manipulations like strategic splitting of groups.
FGSV leverages merging games, regression techniques, and efficient approximation algorithms to enable practical applications in machine learning, explainable AI, and economic modeling.

The Faithful Group Shapley Value (FGSV) extends the classical Shapley value to quantify the contribution of groups of agents, data points, or features in cooperative game-theoretic and machine learning contexts. FGSV aims to attribute value faithfully at the group level, so that joint contributions and interactions, including synergies, are captured and the valuation is robust against manipulations such as strategic splitting or regrouping. The concept arises from multiple lines of research, including axiomatic characterizations in coalitional games, group data valuation in machine learning, causal-aware explanations, and counterfactual economic decompositions.

1. Definition and Conceptual Foundations

FGSV generalizes the individual Shapley value in several frameworks. In the game-theoretic tradition, the Shapley group value is defined by merging the group $C$ into a single proxy player $c$ and evaluating its classical Shapley value in a modified game: $\varphi^g(C; N, v) = \varphi_c(N_C, v_C),$ where $N_C = (N \setminus C) \cup \{c\}$ and $v_C(S) = v(S)$ if $c \notin S$, else $v((S \setminus \{c\}) \cup C)$ (Flores et al., 2014).

In certain machine learning settings, FGSV is instead the sum of individual Shapley values over the group: $\text{FGSV}(S_0) = \sum_{i \in S_0} \text{SV}(i),$ where $\text{SV}(i)$ is the Shapley value of data point $i$. This form is enforced by a faithfulness axiom requiring invariance to opponent regrouping (Lee et al., 25 May 2025).

In the interaction index literature, FGSV arises as the unique solution to a weighted least squares regression that fits the best $\ell$-order polynomial to the value function $v(\cdot)$, distributing attributions to all subsets of features up to a prescribed order (Tsai et al., 2022).

In structural economic applications, the group Shapley value is the unique constrained solution yielding additive importance of parameter groups, interpretable as contributions to counterfactual outcomes (Kwon et al., 9 Oct 2024).
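The merging-game construction can be sketched in a few lines of Python. This is a minimal illustration rather than code from the cited work: the function names, the proxy label `"c"`, and the three-player majority game are all assumptions chosen for the example, and the brute-force enumeration over orderings is only practical for a handful of players.

```python
from itertools import permutations
from math import factorial


def shapley(players, v):
    """Exact Shapley values: average marginal contribution over all orderings."""
    players = list(players)
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        seen = frozenset()
        for p in order:
            totals[p] += v(seen | {p}) - v(seen)
            seen = seen | {p}
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def shapley_group_value(players, v, group, proxy="c"):
    """Merge `group` into one proxy player and return that player's Shapley value."""
    group = frozenset(group)
    merged_players = [p for p in players if p not in group] + [proxy]

    def v_merged(S):
        # If the proxy is present, it stands for the whole group C.
        if proxy in S:
            return v((S - {proxy}) | group)
        return v(S)

    return shapley(merged_players, v_merged)[proxy]


def majority(S):
    """Hypothetical toy game: a coalition wins (worth 1) with at least two of three players."""
    return 1.0 if len(S) >= 2 else 0.0


players = [1, 2, 3]
merged_value = shapley_group_value(players, majority, {1, 2})
individual = shapley(players, majority)
additive_value = individual[1] + individual[2]

print(merged_value)    # 1.0
print(additive_value)  # about 0.667
```

In this toy game the merged pair always wins on its own, so the proxy player receives the full worth of 1, while the two members' individual Shapley values sum to only 2/3. The gap shows that the merging-game value and the sum-of-individuals value are distinct quantities in general.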

2. Axiomatic Characterization

FGSV is uniquely specified by classical and group-oriented axioms:

Null Player: Adding a null element does not affect the group value.
Symmetry: Permuting labels of agents, data, or features does not change the attribution.
Linearity: The value is linear with respect to mixtures of games or utility functions.
Efficiency: The sum of all group values totals the worth of the grand coalition.
Faithfulness: Partitioning outside the group (including strategic splitting) does not influence the group’s valuation (Lee et al., 25 May 2025).

Additional group-specific axioms include Coalitional Balanced Contributions (G-CBC) and Symmetry over Pure Bargaining Games (G-SPB) (Flores et al., 2014).

These axioms enforce that group valuation not only aggregates individual contributions but also acknowledges synergy or complementarity within the group, differentiating FGSV from additive methods that ignore interaction effects.
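A small worked example may help make the faithfulness axiom concrete. The game below is an assumed toy case, not one taken from the cited papers. Take four duplicate data points, $N = \{1,2,3,4\}$, with $U(\emptyset) = 0$ and $U(S) = 1$ for every nonempty $S$, and consider the group $A = \{1,2\}$.

By symmetry and efficiency, each point has $\text{SV}(i) = \frac{1}{4}$, so $\sum_{i \in A} \text{SV}(i) = \frac{1}{2}$ regardless of how points 3 and 4 are grouped.

If each group is instead treated as a single player, the partition $\{A, \{3,4\}\}$ gives two symmetric players sharing a total of 1, so $A$ receives $\frac{1}{2}$. If the other group splits into $\{3\}$ and $\{4\}$, there are three symmetric players, and $A$ receives $\frac{1}{3}$.

The group-as-player valuation of $A$ therefore changes when the outside points are regrouped, while the sum of individual Shapley values does not. This invariance is what the faithfulness axiom requires.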

3. Mathematical Formulation and Algorithms

In transferable utility games, the Shapley group value uses merging games: $\varphi^g(C; N, v) = \varphi_c(N_C, v_C),$ with $v_C(S)$ defined above, ensuring the calculation encompasses both individual and group-level impact (Flores et al., 2014).

FGSV as a sum of individual Shapley values is expressed as $\text{FGSV}(S_0) = \sum_{i \in S_0} \text{SV}(i),$ with

$\text{SV}(i) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!(n - |S| - 1)!}{n!} [U(S \cup \{i\}) - U(S)],$

and FGSV can be further reformulated as $\text{FGSV}(S_0) = \frac{|S_0|}{n}[U(N) - U(\emptyset)] + \sum_{s=1}^{n-1} \mathcal{T}(s),$ where $\mathcal{T}(s)$ involves averaging over subset intersections and uses a hypergeometric weighting (Lee et al., 25 May 2025).
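The subset formula for $\text{SV}(i)$ and the sum over the group translate directly into code. The sketch below assumes data points are indexed $0, \dots, n-1$ and that the utility takes a frozenset of indices. The function names and the square-root utility are hypothetical choices for illustration, and the exponential enumeration is feasible only for small $n$.

```python
from itertools import combinations
from math import factorial, sqrt


def shapley_value(i, n, U):
    """SV(i) via the subset formula; U maps a frozenset of indices to a float."""
    others = [j for j in range(n) if j != i]
    total = 0.0
    for size in range(n):
        # Weight |S|! (n - |S| - 1)! / n! for subsets S of this size.
        weight = factorial(size) * factorial(n - size - 1) / factorial(n)
        for subset in combinations(others, size):
            S = frozenset(subset)
            total += weight * (U(S | {i}) - U(S))
    return total


def fgsv_exact(group, n, U):
    """FGSV(S_0) as the sum of individual Shapley values over the group."""
    return sum(shapley_value(i, n, U) for i in group)


# Hypothetical utility with diminishing returns in the total weight of the data.
point_weights = [1.0, 2.0, 3.0, 4.0]


def concave_utility(S):
    return sqrt(sum(point_weights[i] for i in S))


n = len(point_weights)
value_a = fgsv_exact({0, 1}, n, concave_utility)
value_b = fgsv_exact({2, 3}, n, concave_utility)
value_all = fgsv_exact(range(n), n, concave_utility)
```

Because the group value is a sum of individual values, disjoint groups add up: `value_a + value_b` equals `value_all`, which in turn equals $U(N) - U(\emptyset)$ by efficiency.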

Efficient approximation algorithms exploit concentration properties of the hypergeometric distribution and paired Monte Carlo sampling, achieving low sample complexity and variance reduction under stability conditions (Lee et al., 25 May 2025).

In structural models, constrained weighted least squares (WLS) formulations are used to estimate group values robustly, even with missing simulation results: $\min_{\{\varphi_M\}} \sum_{\Psi \subset \Pi} \{g(\cup_{M \in \Psi}M) - \sum_{M \in \Psi}\varphi_M\}^2 k(\Pi,\Psi), \quad \text{s.t. } \sum_M \varphi_M = g(P),$ with explicit kernel weighting and a unique closed-form solution (Kwon et al., 9 Oct 2024).
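The constrained WLS problem can be solved through its KKT system. The sketch below makes several assumptions that are not stated in the text: the kernel $k$ is taken to be the standard Shapley kernel $(m-1) / [\binom{m}{s}\, s\, (m-s)]$ for a collection of $s$ out of $m$ groups, the sum runs over proper nonempty collections, $g(\emptyset) = 0$, and every simulation result is available. The function names and the toy quadratic game are hypothetical.

```python
from itertools import combinations
from math import comb


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(a) for a in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x


def group_shapley_wls(m, g):
    """Group values for m >= 2 groups; g maps a frozenset of group indices to a float."""
    A = [[0.0] * m for _ in range(m)]  # weighted normal matrix X^T W X
    b = [0.0] * m                      # weighted right-hand side X^T W y
    for size in range(1, m):
        # Assumed Shapley kernel weight for collections of this size.
        k = (m - 1) / (comb(m, size) * size * (m - size))
        for subset in combinations(range(m), size):
            y = g(frozenset(subset))
            for i in subset:
                b[i] += k * y
                for j in subset:
                    A[i][j] += k
    # KKT system: 2 A phi + lambda * 1 = 2 b, together with sum(phi) = g(all groups).
    kkt = [[2.0 * A[i][j] for j in range(m)] + [1.0] for i in range(m)]
    kkt.append([1.0] * m + [0.0])
    rhs = [2.0 * bi for bi in b] + [g(frozenset(range(m)))]
    return solve_linear(kkt, rhs)[:m]


# Hypothetical outcome function: squared total "size" of the included groups.
group_sizes = [1.0, 2.0, 3.0]


def quadratic_outcome(groups):
    return sum(group_sizes[i] for i in groups) ** 2


phi = group_shapley_wls(3, quadratic_outcome)
print(phi)  # close to [6.0, 12.0, 18.0]
```

For this quadratic game each pairwise interaction term $2 w_i w_j$ is split equally between the two groups involved, which gives $\varphi_i = w_i \sum_j w_j$. The values sum to $g(P) = 36$, as the constraint requires.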

In the context of faithful interaction indices, the unique solution is obtained via minimizing a weighted regression: $\min_{E} \sum_{S \subseteq [d]} \mu(S)[v(S) - \sum_{T \subseteq S, |T| \leq \ell} E_T(v,\ell)]^2,$ subject to extended symmetry, efficiency, dummy, and linearity (Tsai et al., 2022).

4. Properties and Faithfulness Guarantees

FGSV guarantees:

Robustness to Manipulation: The faithfulness axiom ensures resistance to shell company attacks whereby adversaries split groups to inflate valuation (Lee et al., 25 May 2025).
Synergy Capture: Interaction effects are explicitly included, with marginal and second-order differences embedded in the valuation (Flores et al., 2014). The merging game setup, regression-based interaction allocation, and inclusion-exclusion analysis ensure synergy is neither double-counted nor omitted (Tsai et al., 2022, Kwon et al., 9 Oct 2024).
Interpretability: The decompositions sum to the total utility change and are presented in forms analogous to regression coefficient tables, facilitating practical interpretation (Kwon et al., 9 Oct 2024).
Computational Efficiency: Approximation algorithms leverage subset-size aggregation, Taylor expansions, and paired Monte Carlo, achieving significant computational gains over naive summation approaches (Lee et al., 25 May 2025).
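To give a flavor of sampling-based approximation, the sketch below estimates the sum-of-individuals FGSV with paired permutation sampling: each random ordering is evaluated together with its reverse, a generic antithetic variance-reduction device. This is a swapped-in, simpler technique and not the algorithm of Lee et al., which relies on hypergeometric concentration. The function names, default sample size, and toy utility are assumptions.

```python
import random
from math import sqrt


def fgsv_monte_carlo(group, n, U, num_pairs=2000, seed=0):
    """Estimate the sum of Shapley values over `group` from paired random orderings."""
    rng = random.Random(seed)
    group = set(group)
    total = 0.0
    for _ in range(num_pairs):
        order = list(range(n))
        rng.shuffle(order)
        # Evaluate the ordering and its reverse as an antithetic pair.
        for perm in (order, order[::-1]):
            prefix = set()
            previous = U(frozenset())
            for i in perm:
                prefix.add(i)
                current = U(frozenset(prefix))
                if i in group:
                    # Marginal contribution of i when it joins the prefix.
                    total += current - previous
                previous = current
    return total / (2 * num_pairs)


def toy_utility(S):
    """Hypothetical symmetric utility with diminishing returns in dataset size."""
    return sqrt(len(S))


estimate = fgsv_monte_carlo({0, 1}, 5, toy_utility)
```

Since the five points are interchangeable under `toy_utility`, the exact value for a two-point group is $\frac{2}{5}\sqrt{5} \approx 0.894$, which gives a reference for judging the estimate. Each ordering costs $n$ utility evaluations, so the total cost is $2 \cdot \texttt{num\_pairs} \cdot n$ evaluations instead of an exponential enumeration.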

5. Applications in Machine Learning and Economics

FGSV is deployed in multiple domains:

Data Valuation: FGSV quantifies contributions of batches of data points, providing faithful copyright attribution and stable valuations in federated learning and collaborative data sharing (Lee et al., 25 May 2025). Replacing earlier GSV-based royalty formulas with FGSV eliminates sensitivity to adversarial data splitting.
Feature Selection and Explainability: FGSV-like interaction indices (e.g., Faith-Shap) yield efficient and faithful decompositions of model predictions, and are reported to outperform post-hoc baselines in both attribution accuracy and computational speed (Sun et al., 2023, Tsai et al., 2022).
Causal and Counterfactual Analysis: FGSV, implemented as group Shapley decompositions or causal-aware group attributions (CAGE extension), allows rigorous quantification of group-level effects in models with causal dependencies (Breuer et al., 17 Apr 2024). Interventions respect the underlying DAG, ensuring faithfulness in the presence of statistical and causal feature dependencies.
Structural Economic Simulation: FGSV provides unique, additive decomposition of parameter changes in counterfactual simulations. Applications include attribution of differences in income distribution, capital misallocation, and policy analysis across countries or reform experiments (Kwon et al., 9 Oct 2024).

In network settings (e.g., terrorist networks), FGSV aids in identifying key groups for intervention, with experimental evidence that synergistic groups outperform groups formed by simply aggregating the individually most influential agents (Flores et al., 2014).

6. Comparison with Additive and Synergistic Approaches

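The cited frameworks treat additivity differently. In the merging-game tradition, group values obtained by simply summing individual Shapley values are criticized for missing intra-group interactions and synergies, which can make attributions misleading (Flores et al., 2014). In the data valuation setting, the concern is manipulability: there, the faithfulness axiom is what enforces the sum-of-individual-values form, because valuing each group as a single player is sensitive to how the other data are partitioned (Lee et al., 25 May 2025). Across these formulations, FGSV relies on faithfulness together with group-specific axioms and constructions (balanced contributions, the merging game, interaction indices, and robust regression-based estimation) so that group valuation is both resistant to manipulation and inclusive of joint effects.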

Alternative group extensions, such as the Union Shapley Value, provide potential-based group attributions via collective removal, while dual synergistic indices measure only the surplus attributable to joint group action (Kępczyński et al., 27 May 2025). FGSV combines both: consistency with individual Shapley values, balanced handling of synergy, and invariance to external regrouping.

7. Open Directions and Limitations

Although FGSV addresses many longstanding challenges in group-level value attribution, limitations remain:

Computational Scalability: Despite efficient algorithms, exact computation scales poorly for very large datasets or feature sets; further research into advanced sampling, distributed computation, and leveraging problem structure is warranted (Lee et al., 25 May 2025).
Faithfulness in Overlapping or Hierarchical Groups: Extensions to hierarchical coalition structures or overlapping feature groups are active areas of investigation, as classical FGSV theory presumes non-overlapping groups (Rozemberczki et al., 2022).
Integration with Causal Inference and Model Dependence: Incorporating causal relations and external knowledge, building on approaches like CAGE, could further enhance group faithfulness in practical ML and scientific applications (Breuer et al., 17 Apr 2024).
Interpretability and Impact Assessment: Translating FGSV attributions into actionable domain insights remains nontrivial, particularly in highly interactive or non-linear environments (Rozemberczki et al., 2022).

The Faithful Group Shapley Value provides theoretically principled, computationally efficient, and manipulation-robust solutions for group-level attribution in cooperative settings, machine learning, explainable AI, and economic modeling. Its axiomatic basis and synergy-aware construction offer meaningful improvements over additive group methods, with broad applicability in modern data-centric scientific domains.