Stochastic LLM Shapley Attribution

Updated 28 February 2026

Shapley attribution is a principled method from cooperative game theory that quantifies contributions of LLM components under inherent stochasticity.
It employs efficient estimation algorithms like Monte Carlo sampling and DAG-Shapley to manage computational costs in complex multi-agent and modular systems.
The approach enhances transparency by rigorously attributing contributions from chain-of-thought steps, tools, and training data, aiding model explainability.

Shapley attribution for stochastic LLM decision support formalizes the problem of explaining and quantifying the contribution of individual model components, prompt elements, data points, modules, agents, or external tools under non-deterministic inference. Grounded in cooperative game theory, Shapley-based approaches enable rigorous attributions despite the stochasticity inherent in LLM decoding or system architecture. Modern implementations span chain-of-thought (CoT) reasoning, multi-agent workflows, retrieval-augmented generation, training data auditing, and probabilistic forecasting.

1. Theoretical Foundations and Stochastic Extension

The Shapley value provides a unique, axiomatic solution for distributing total system performance (utility) across a set $N$ of $n$ “players”, which may be prompt tokens, reasoning steps, tools, modules, or agents. For a cooperative game with value function $v(S)$, where $S \subseteq N$ is a coalition of players, the Shapley value of player $i$ is:

$\phi(i) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!(n - |S| - 1)!}{n!} [v(S \cup \{i\}) - v(S)]$
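As a concrete reading of the formula, the following minimal sketch enumerates every coalition for a three-player game. The players (`doc_a`, `doc_b`, `tool`) and the utility `toy_value` are hypothetical and chosen only so the result can be checked by hand; they do not come from any of the cited works.

```python
from itertools import combinations
from math import factorial

def exact_shapley(players, v):
    """Exact Shapley values by enumerating every coalition S of N without i."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):  # |S| ranges over 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for S in combinations(others, size):
                S = frozenset(S)
                # Marginal contribution of i when it joins coalition S.
                total += weight * (v(S | {i}) - v(S))
        phi[i] = total
    return phi

# Hypothetical toy game: "doc_a" and "tool" only help together (+0.6),
# while "doc_b" always adds a fixed 0.2.
def toy_value(S):
    score = 0.0
    if "doc_a" in S and "tool" in S:
        score += 0.6
    if "doc_b" in S:
        score += 0.2
    return score

players = ["doc_a", "doc_b", "tool"]
phi = exact_shapley(players, toy_value)
print(phi)
```

In this toy game the two complementary players split their joint 0.6 equally (symmetry), the additive player keeps its 0.2, and the attributions sum to the value of the full coalition (efficiency).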

Classically, $v(S)$ is deterministic. In stochastic LLM settings, $v(S)$ corresponds to an expected reward, such as log-likelihood, semantic similarity, or system performance, averaged over the model’s or environment’s randomness (Naudot et al., 3 Nov 2025, Xin et al., 20 Sep 2025).

Three core Shapley axioms (efficiency, symmetry, and the null-player property) provide fairness guarantees. Under unbiased stochastic estimation (e.g., averaging multiple generations per coalition $S$), these axioms hold in expectation (Naudot et al., 3 Nov 2025); violations can still occur in practice with finite samples or when coalition values are not cached and reused consistently.
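The "in expectation" statement follows from the linearity of the Shapley formula in $v$. As a short sketch, write $\hat{v}(S)$ for an unbiased estimate of $v(S)$ and $\hat{\phi}(i)$ for the resulting plug-in attribution (both symbols are introduced here for illustration and do not come from the cited works):

$\mathbb{E}[\hat{\phi}(i)] = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!(n - |S| - 1)!}{n!} \left( \mathbb{E}[\hat{v}(S \cup \{i\})] - \mathbb{E}[\hat{v}(S)] \right) = \phi(i)$

$\sum_{i \in N} \hat{\phi}(i) = \hat{v}(N) - \hat{v}(\emptyset)$

The second identity holds exactly when a single cached estimate $\hat{v}(S)$ is reused wherever a coalition appears (for example, under full enumeration). If each marginal contribution instead draws fresh samples, efficiency is only guaranteed in expectation.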

2. Efficient Estimation and Approximation Algorithms

Exact Shapley computation requires evaluating all $2^n$ coalitions (or, in the permutation form, all $n!$ player orderings), which is prohibitive for all but the smallest $n$. Modern literature proposes a variety of efficient, unbiased (or approximately unbiased) estimation algorithms:

Stratified Sampling (SalaMA): For attribution over CoT reasoning steps, SalaMA samples across insertion positions and coalitions and caches results, yielding unbiased estimates at a cost far below exact enumeration, with variance controlled by the sample size $m$ (Xin et al., 20 Sep 2025).
Monte Carlo Permutation Sampling: Draw $m$ random permutations of the players, accumulate each player's marginal contribution at the point it enters the coalition, and average, for $O(mn)$ value-function calls. Caching further reduces cost, and variance decreases as $O(1/m)$ (Naudot et al., 3 Nov 2025, Horovicz, 14 Dec 2025).
Coalition Pruning and DAG-Shapley: In modular or agent workflows structured as directed acyclic graphs, pruning non-viable coalitions and hierarchical memoization yield exact Shapley values at orders-of-magnitude lower cost (Xia et al., 6 Dec 2025).
Surrogate Regression (Kernel SHAP): In RAG/LLM document attribution, fit weighted linear regression to coalition utility values, mapping presence indicators to predicted performance; coefficients approximate Shapley values with high sample efficiency (Nematov et al., 6 Jul 2025).
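The permutation-sampling item above can be sketched in a few lines. This is a minimal illustration, not the implementation from any cited paper: `mc_shapley`, `toy_value`, and the player names are hypothetical, and the value function is assumed to be deterministic (or already averaged over LLM randomness) so that caching a coalition's value is safe.

```python
import random

def mc_shapley(players, v, num_permutations, seed=0):
    """Permutation-sampling Shapley estimate with a per-coalition cache."""
    rng = random.Random(seed)
    cache = {}

    def cached_v(S):
        # Each distinct coalition is evaluated at most once.
        if S not in cache:
            cache[S] = v(S)
        return cache[S]

    phi = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = frozenset()
        prev = cached_v(coalition)
        for p in order:
            coalition = coalition | {p}
            cur = cached_v(coalition)
            phi[p] += cur - prev  # marginal contribution of p in this ordering
            prev = cur
    for p in phi:
        phi[p] /= num_permutations
    return phi, len(cache)

# Hypothetical toy utility: "doc_a" and "tool" only help together.
def toy_value(S):
    score = 0.0
    if "doc_a" in S and "tool" in S:
        score += 0.6
    if "doc_b" in S:
        score += 0.2
    return score

players = ["doc_a", "doc_b", "tool"]
phi_hat, n_evals = mc_shapley(players, toy_value, num_permutations=2000, seed=0)
print(phi_hat, "distinct coalition evaluations:", n_evals)
```

Because the marginals within one permutation telescope to $v(N) - v(\emptyset)$, this estimator satisfies efficiency exactly whenever cached values are reused, while individual attributions carry sampling noise that shrinks as more permutations are drawn.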

Table: Core Algorithms for Shapley Estimation in Stochastic LLMs

Algorithm	Complexity	Contexts
Exact Enumeration	$O(2^n)$ coalition evaluations	Tiny $n$; gold standard for validation
SalaMA	Sampling-based; set by the sample size $m$	CoT expression attribution (Xin et al., 20 Sep 2025)
Monte Carlo	$O(mn)$ value-function calls for $m$ permutations	General feature or tool attribution
DAG-Shapley	Exact, restricted to viable coalitions	Multi-agent DAGs (Xia et al., 6 Dec 2025)
Kernel SHAP	Budget of sampled coalitions, in LLM calls	Document/source attribution (Nematov et al., 6 Jul 2025)

Cache utilization, per-coalition repeated sampling, and stratified selection are widely employed for additional variance and cost control.
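The surrogate-regression idea behind Kernel SHAP can be illustrated with a small pure-Python sketch. It is a simplified stand-in, not the pipeline of the cited work: for checkability it enumerates all proper, non-empty coalitions instead of sampling them, enforces efficiency by eliminating the last coefficient, and uses a hypothetical utility `toy_value`. The helper names `solve` and `kernel_shap` are likewise assumptions.

```python
from itertools import combinations
from math import comb

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def kernel_shap(players, v):
    """Weighted least squares on presence indicators with the Shapley kernel."""
    n = len(players)
    base = v(frozenset())
    delta = v(frozenset(players)) - base  # total utility to distribute
    k = n - 1                             # free coefficients after elimination
    A = [[0.0] * k for _ in range(k)]     # normal-equation matrix
    b = [0.0] * k
    for size in range(1, n):
        w = (n - 1) / (comb(n, size) * size * (n - size))  # Shapley kernel weight
        for S in combinations(range(n), size):
            z = [1.0 if j in S else 0.0 for j in range(n)]
            # Substitute phi_last = delta - sum(other phis) into the linear model.
            x = [z[j] - z[n - 1] for j in range(k)]
            y = v(frozenset(players[j] for j in S)) - base - z[n - 1] * delta
            for r in range(k):
                b[r] += w * x[r] * y
                for c in range(k):
                    A[r][c] += w * x[r] * x[c]
    coef = solve(A, b)
    coef.append(delta - sum(coef))
    return dict(zip(players, coef))

# Hypothetical toy utility: "doc_a" and "tool" only help together.
def toy_value(S):
    score = 0.0
    if "doc_a" in S and "tool" in S:
        score += 0.6
    if "doc_b" in S:
        score += 0.2
    return score

players = ["doc_a", "doc_b", "tool"]
phi_reg = kernel_shap(players, toy_value)
print(phi_reg)
```

With every coalition included, the regression coefficients coincide with the exact Shapley values of the toy game; in practice only a sampled subset of coalitions is evaluated, and the coefficients become an approximation whose quality depends on that budget.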

3. Attribution Targets and Value Functions

Shapley attribution applies to a spectrum of entities within stochastic LLM-based decision support:

Chain-of-Thought Steps: Each mathematical or logical expression in a reasoning chain is treated as a player. The value function $v(S)$ typically couples the model's step-wise log-probability confidence with output correctness, crediting only truly useful reasoning fragments (Xin et al., 20 Sep 2025).
Tools and APIs: When LLM agents invoke external tools, the set of tool options forms the player set. $v(S)$ quantifies, for each subset $S$, the semantic quality or similarity of the LLM output when restricted to those tools, with Shapley-based “tool importance” scores guiding tool selection and debugging (Horovicz, 14 Dec 2025).
Features, Factors, Prompt Spans: In explainability contexts, players are input features or sectors. Utilities may be model scores, probabilities, or log-probs, and attributions are interpreted as feature importances (Naudot et al., 3 Nov 2025, Nan et al., 14 Jan 2026).
Modules, Agents, Nodes: For modular agents and multi-agent systems, each module or agent is a node, and $v(S)$ represents the aggregated system reward when only the subset $S$ is active, under randomized or stochastic LLM executions (Yang et al., 1 Feb 2025, Xia et al., 6 Dec 2025, Yang et al., 11 Nov 2025).
Training Instances: In instance-attribution scenarios, Shapley values identify the influence of each training example on test accuracy, with value function $v(S)$ (the accuracy on a held-out set for training subset $S$) made practical via fine-tuning-free NTK surrogates (FreeShap) (Wang et al., 2024).

The stochasticity of the LLM (e.g., sampling, temperature, API randomness) is addressed by repeatedly sampling per coalition $S$ and using the mean as $v(S)$ (Xin et al., 20 Sep 2025, Xia et al., 6 Dec 2025).
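A minimal sketch of this averaging step follows. The noisy scorer stands in for a real stochastic LLM call and is entirely hypothetical, as are the names `make_noisy_scorer`, `averaged_value`, and the step labels; the point is only to show a per-coalition mean over $T$ runs that is cached so every later lookup sees the same estimate.

```python
import random
from statistics import mean

def make_noisy_scorer(seed=0, noise_std=0.1):
    """Hypothetical stand-in for one stochastic LLM run scored on coalition S."""
    rng = random.Random(seed)

    def run_once(S):
        signal = 0.5 * ("step_1" in S) + 0.3 * ("step_2" in S)
        return signal + rng.gauss(0.0, noise_std)  # simulated decoding noise

    return run_once

def averaged_value(run_once, T):
    """Return v(S) estimated as the mean of T runs, cached per coalition."""
    cache = {}

    def v(S):
        S = frozenset(S)
        if S not in cache:
            cache[S] = mean(run_once(S) for _ in range(T))
        return cache[S]

    return v

v = averaged_value(make_noisy_scorer(seed=1), T=200)
full = v({"step_1", "step_2"})
print(round(full, 3))
```

Caching the averaged value trades a small amount of frozen sampling error for consistency: every marginal contribution that touches the same coalition uses the same number.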

4. Evaluation Metrics and Covariate Alignment

To synthesize attribution vectors into actionable scalar metrics or guidance, specialized evaluation constructs are introduced:

CoSP Metric (Cardinality of Shapley Positives): Counts the number of elements with positive average Shapley values (possibly with a penalty for negatives), yielding a metric closely aligned with model performance. Covariance theorems establish a monotonic correspondence with achieved accuracy, even under stochastic Monte Carlo estimation (Xin et al., 20 Sep 2025).
Performance Correlation: Across benchmarks, Shapley attributions and their surrogates are evaluated via rank correlation, precision@k, and the direct performance change upon removals or substitutions (e.g., the drop in success rate when the top-attributed tool or module is pruned) (Horovicz, 14 Dec 2025, Nematov et al., 6 Jul 2025, Wang et al., 2024).
Additivity and Decomposition: The Shapley decomposition allows sum-reconstruction of intermediate scores (as in PRISM's probability decomposition), providing interpretable rationale for outputs (Nan et al., 14 Jan 2026).

In multi-agent and modular settings, covariance analysis between overall system reward and aggregated Shapley statistics validates the faithfulness of the method.
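The counting idea behind CoSP can be written down directly. The sketch below is an assumption-laden illustration: the exact penalty form used in the cited work is not given here, so `negative_penalty` (a linear deduction per negative element) and the tolerance `eps` are hypothetical parameters.

```python
def cosp(shapley_values, negative_penalty=0.0, eps=0.0):
    """Count elements with positive Shapley value, optionally penalizing negatives.

    negative_penalty and eps are illustrative parameters, not taken from the source.
    """
    positives = sum(1 for x in shapley_values if x > eps)
    negatives = sum(1 for x in shapley_values if x < -eps)
    return positives - negative_penalty * negatives

# Hypothetical per-step attributions for a four-step reasoning chain.
steps = [0.3, -0.1, 0.0, 0.2]
print(cosp(steps))
print(cosp(steps, negative_penalty=0.5))
```

Setting `eps` slightly above zero is one way to keep Monte Carlo noise from flipping near-zero attributions into the positive count.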

5. Practical Considerations and Best Practices

Across attribution modalities, several practical guidelines are substantiated:

Monte Carlo Sample Size (m, T): Select a sufficiently large $m$ for subset/permutation sampling and $T$ for output averaging. Empirical variance decays as $O(1/m)$; typical ranges for $m$ and $T$ are given in the cited works (Xin et al., 20 Sep 2025, Horovicz, 14 Dec 2025).
Caching and Memoization: Always employ result caching to amortize costs, particularly when coalition overlap is high (Naudot et al., 3 Nov 2025, Xia et al., 6 Dec 2025).
Coalition Pruning: In structured agent graphs, prune non-viable coalitions to reduce redundancy without loss of attribution accuracy (Xia et al., 6 Dec 2025).
Convergence Diagnostics: Monitor running averages and standard errors of the $\phi(i)$ estimates, stopping when the desired accuracy is reached (Horovicz, 14 Dec 2025).
Visualization: Visualize sorted attribution vectors via color-coded bar plots, overlays on CoT steps, or heatmaps over modules/tools. Negative Shapley values should be highlighted for user review or pruning (Xin et al., 20 Sep 2025, Horovicz, 14 Dec 2025).
Robustness: True Shapley attributions are more sign-robust under dataset resampling or stochastic inference than leave-one-out or counterfactual proxies (Wang et al., 2024).
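The convergence-diagnostics guideline above can be sketched with a running mean and standard error per player. This is one possible stopping rule under stated assumptions, not the procedure of the cited work: `RunningStat`, `sample_until_converged`, the tolerance, and the simulated marginal-contribution stream are all hypothetical.

```python
import math
import random

class RunningStat:
    """Welford running mean and standard error for one player's marginals."""

    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        d = x - self.mean
        self.mean += d / self.n
        self.m2 += d * (x - self.mean)

    def stderr(self):
        if self.n < 2:
            return math.inf
        # Sample variance divided by n, then square root.
        return math.sqrt(self.m2 / (self.n - 1) / self.n)

def sample_until_converged(draw_marginal, tol, min_samples=30, max_samples=100_000):
    """Draw marginal contributions until the standard error falls below tol."""
    stat = RunningStat()
    while stat.n < max_samples:
        stat.update(draw_marginal())
        if stat.n >= min_samples and stat.stderr() < tol:
            break
    return stat

# Simulated marginals: the player contributes 0.6 in about half of the orderings.
rng = random.Random(0)
stat = sample_until_converged(lambda: 0.6 if rng.random() < 0.5 else 0.0, tol=0.01)
print(stat.n, round(stat.mean, 3), round(stat.stderr(), 4))
```

The `min_samples` floor guards against stopping on an early run of identical draws, where the estimated standard error would be misleadingly small.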

6. Empirical Insights and Application Case Studies

Empirical research validates Shapley-based stochastic attribution across a wide range of LLM decision-support deployments:

Mathematical CoT Optimization: SalaMAnder demonstrates that optimizing for high positive CoSP scores reliably aligns with higher few-shot CoT accuracy. Visualization of per-step Shapley values enables prompt refinement by pruning negative or neutral steps (Xin et al., 20 Sep 2025).
Agent and Tool Attribution: AgentSHAP successfully isolates essential tools in LLM agents, achieving high consistency and faithfulness, as confirmed by tool-removal ablations on API-Bank (Horovicz, 14 Dec 2025).
Data Auditing via Instance Attribution: FreeShap robustly resolves helpful versus harmful data points under perturbation, outperforming leave-one-out in data removal, selection, and wrong-label detection tasks (Wang et al., 2024).
Retrieval-Augmented Generation: Kernel SHAP attains top-k precision above 0.8 relative to exact Shapley values at a moderate computational budget, while TMC (truncated Monte Carlo) and Beta-Shapley provide alternatives for variance-constrained applications (Nematov et al., 6 Jul 2025).
Probability Estimation: PRISM reconstructs LLM-calibrated class probabilities from per-factor Shapley marginals, enhancing AUROC and calibration versus direct prompting in healthcare and finance domains (Nan et al., 14 Jan 2026).
Multi-Agent Optimization and Online System Improvement: HiveMind’s DAG-Shapley enables live, per-agent contribution analysis in financial trading, driving prompt optimization cycles. DAG-Shapley reduces LLM API calls by over 80% with no loss in ranking accuracy (Xia et al., 6 Dec 2025).
Reinforcement Learning Credit Assignment: SHARP integrates Shapley-based marginal credit as a core policy gradient reward, fostering stable, fine-grained reinforcement learning in complex agent systems, substantially outperforming baselines in accuracy and learning stability (Li et al., 9 Feb 2026).

7. Limitations, Extensions, and Research Directions

Several open challenges and extensions occupy current research:

Approximation-Accuracy Trade-offs: Larger player sets necessitate approximate methods, which may violate Shapley axioms (efficiency, symmetry) under aggressive sampling or windowing (Naudot et al., 3 Nov 2025).
Pairwise and Higher-Order Interactions: Classic Shapley values do not capture synergistic or antagonistic interactions between players; extensions like Shapley interaction indices can address this (Horovicz, 14 Dec 2025, Nematov et al., 6 Jul 2025).
Correlated Features or Agents: Shapley assumes player marginality is meaningful; high correlation may bias attributions, suggesting the need for causal or conditional extensions (Nan et al., 14 Jan 2026).
Scalability: For very high dimensionality (e.g., document-level RAG over large retrieved sets), pipeline pruning, two-stage attribution, or surrogate modeling is needed to maintain tractability (Nematov et al., 6 Jul 2025).
Principle-Aware Monte Carlo Schemes: Enforcing efficiency or other axioms in approximate settings by constrained sampling or reweighting is a nascent area (Naudot et al., 3 Nov 2025).
Explainability in Workflow Structures: For pipeline or DAG-structured systems, efficient attribution algorithms exploiting architectural priors (such as DAG-Shapley) enable practical real-world optimization (Xia et al., 6 Dec 2025).
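For the pairwise-interaction item above, one standard extension is the Shapley interaction index of Grabisch and Roubens, given here as a reference formula in the notation of Section 1 (the symbol $I(i,j)$ is introduced for illustration and is not taken from the cited works):

$I(i,j) = \sum_{S \subseteq N \setminus \{i,j\}} \frac{|S|!(n - |S| - 2)!}{(n - 1)!} [v(S \cup \{i,j\}) - v(S \cup \{i\}) - v(S \cup \{j\}) + v(S)]$

For $n = 2$ the sum has the single term $S = \emptyset$ with weight 1, so $I(i,j) = v(\{i,j\}) - v(\{i\}) - v(\{j\}) + v(\emptyset)$: positive when the two players are synergistic, negative when they are redundant, and zero when their contributions are additive.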

A plausible implication is that as LLM decision-support systems grow in size and complexity, scalable, theoretically sound, and noise-robust Shapley-based attributions will become foundational for both automated prompt engineering and the deployment of transparent, auditable machine reasoning pipelines.