Tool-Level Attribution in AI Systems
- Tool-level attribution is the quantitative process of analyzing external tools' impact on AI outputs, yielding clear, fair, and interpretable contribution scores.
- It employs methodologies such as Shapley value calculation and Monte Carlo approximations to evaluate the importance and marginal benefit of each tool.
- This approach supports debugging, optimization, and regulatory compliance by transparently attributing tool contributions in LLM agents and other AI systems.
Tool-level attribution refers to the quantitative and qualitative analysis of how specific external tools, components, or systems contribute to the output or decision-making process of a primary agent, model, or workflow. In the context of LLMs and AI agents, tool-level attribution assigns fairness-grounded importance scores to each tool used during agent execution, interprets how and why tools impact final outputs, and quantifies the distinct contributions of these tools with principled methodologies. Tool-level attribution is critical for debugging, trust calibration, optimization, transparency, and social accountability across both technical and sociotechnical settings.
1. Problem Definition and Scope
Tool-level attribution addresses the challenge of assigning meaningful, fair, and interpretable credit to external tools invoked by an agent or system in generating a particular response to a task or prompt. This extends beyond simple invocation logs to address the actual impact or necessity of each tool in shaping the agent’s final output. In LLM agents, this problem arises because tool usage may not map linearly to importance, and existing explainability methods such as attention analysis or gradient-based attributions primarily target the agent’s internal representations, not its tool-mediated behavior (Horovicz, 14 Dec 2025).
Applications include:
- Cost and latency optimization: Removing or deprioritizing under-utilized or unnecessary tools.
- Debugging and trust: Verifying reliance on appropriate tools for specific task classes (e.g., ensuring calculators are used for arithmetic queries).
- Transparency and accountability: Enabling verifiable reporting and regulatory compliance in systems where tool or data provenance is critical.
- Sociotechnical signaling: In open-source development, tool-level attribution encompasses disclosure of AI assistants and the form in which they are credited, in order to manage perception and accountability (Kraishan, 30 Nov 2025).
2. Formal Frameworks for Tool-Level Attribution
Several formalizations exist for tool-level attribution, with recent work converging on cooperative game-theoretic principles to guarantee fairness and uniqueness.
2.1 Shapley Value Attribution
The Shapley value is the unique attribution scheme satisfying the standard fairness axioms (efficiency, symmetry, null player, additivity) for quantifying each tool's contribution when the agent $A$, endowed with tool set $T = \{t_1, \dots, t_n\}$, is viewed as a black-box function of the tools it may invoke. Given the agent's response $r(S)$ for a subset $S \subseteq T$ and a task-specific value function $v(S)$ (e.g., the cosine similarity of $r(S)$ to the full-tool response $r(T)$), the Shapley value of tool $t_i$ is:

$$\phi_i = \sum_{S \subseteq T \setminus \{t_i\}} \frac{|S|!\,(n-|S|-1)!}{n!}\,\bigl[v(S \cup \{t_i\}) - v(S)\bigr]$$

This value quantifies the marginal impact of adding tool $t_i$ across all possible tool subsets, averaged over all orderings (Horovicz, 14 Dec 2025, Alpay et al., 23 Sep 2025).
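As a concrete illustration of this formula, the sum can be evaluated directly by enumerating all subsets; a minimal Python sketch (feasible only for small $n$; the value function `v` is a caller-supplied stand-in for any task-specific score):

```python
from itertools import combinations
from math import factorial

def shapley_exact(tools, v):
    """Exact Shapley values via full subset enumeration (O(2^n) calls to v)."""
    n = len(tools)
    phi = {}
    for t in tools:
        others = [u for u in tools if u != t]
        total = 0.0
        for k in range(n):  # subset sizes 0 .. n-1
            w = factorial(k) * factorial(n - k - 1) / factorial(n)
            for S in combinations(others, k):
                S = frozenset(S)
                total += w * (v(S | {t}) - v(S))  # weighted marginal contribution
        phi[t] = total
    return phi
```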
2.2 Monte Carlo Approximation
Since evaluating all $2^n$ tool subsets is computationally prohibitive even for moderate $n$, modern frameworks use Monte Carlo estimators based on random permutations or subset sampling. Each permutation yields a marginal contribution for each tool at its entry position, and the average across $m$ permutations yields an unbiased estimate $\hat{\phi}_i$. Subset sampling enables further efficiency gains at controllable variance (Horovicz, 14 Dec 2025).
Pseudocode for permutation-based estimation:
```
for j in 1 ... m:
    π = random permutation of T
    for i in 1 ... n:
        S = {tools preceding t_i in π}
        Δ_i^π = v(S ∪ {t_i}) − v(S)
        φ_i += Δ_i^π
φ_i /= m    (for each i, after all m permutations)
```
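The same estimator as runnable Python; `value` is a caller-supplied black-box score over tool subsets (a minimal sketch, not the AgentSHAP implementation):

```python
import random

def shapley_monte_carlo(tools, value, m=200, seed=0):
    """Permutation-sampled Monte Carlo estimate of tool-level Shapley values.

    tools : list of tool identifiers
    value : callable mapping a frozenset of tools to a scalar v(S)
    m     : number of random permutations
    """
    rng = random.Random(seed)
    phi = {t: 0.0 for t in tools}
    for _ in range(m):
        perm = tools[:]
        rng.shuffle(perm)
        prefix, v_prev = frozenset(), value(frozenset())
        for t in perm:
            prefix = prefix | {t}
            v_curr = value(prefix)
            phi[t] += v_curr - v_prev  # marginal contribution at entry position
            v_prev = v_curr
    return {t: s / m for t, s in phi.items()}

# Toy check with an additive value function (Shapley values equal the weights):
weights = {"calc": 0.6, "search": 0.3, "logger": 0.0}
est = shapley_monte_carlo(list(weights), lambda S: sum(weights[t] for t in S), m=500)
print(est)  # ≈ {'calc': 0.6, 'search': 0.3, 'logger': 0.0}
```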
2.3 Alternative Attribution Formalisms
Some settings use bucketed distributional hypothesis testing (e.g., for LLM-generated code), statistical regression, or explicit symbolic program traces to attribute output to external tools, features, or document fragments (Canonne et al., 25 Jun 2025, Wan et al., 17 Jun 2025).
3. Methodological Instantiations and Evaluation
3.1 Black-Box Tool Attribution for LLM Agents
AgentSHAP, the first principled tool-importance attribution method for LLM agents, models the agent as a black box responding to varying tool subsets. It uses cosine similarity between response embeddings as the value function and computes tool-level Shapley values to produce interpretable tool-importance vectors (Horovicz, 14 Dec 2025). This approach is model-agnostic and does not require internal gradient access.
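A sketch of such a value function, where `run_agent` (the agent queried with a restricted tool subset) and `embed` (any sentence-embedding model) are hypothetical placeholders rather than AgentSHAP's actual interfaces:

```python
import numpy as np

def make_value_fn(run_agent, embed, full_tools):
    """v(S) = cosine similarity between the response generated with tool
    subset S and the response generated with the full tool set."""
    r_full = np.asarray(embed(run_agent(full_tools)))
    def v(subset):
        r_sub = np.asarray(embed(run_agent(subset)))
        return float(r_sub @ r_full /
                     (np.linalg.norm(r_sub) * np.linalg.norm(r_full)))
    return v
```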
- Experimental setup: Evaluations on API-Bank, a benchmark with annotated ground-truth tool usage, use metrics including Top-1 Accuracy, cosine similarity across runs, quality drop upon tool removal, and Shapley gap between relevant and irrelevant tools.
- Key findings: With 8 tools, AgentSHAP achieves high consistency across runs (cosine similarity 0.945; Top-1 Accuracy 100% for three-tool cases), faithful identification of indispensable tools (quality drop of 0.67 on removal), and clear separation between relevant and irrelevant tools as measured by the Shapley gap. Cross-domain experiments confirm domain-appropriate tool attribution in 86% of cases.
3.2 Executable Text Attribution
GenerationPrograms formalizes the attribution process as executing explicit, modular text-operation programs that document, for each output sentence, the set of contributing tool outputs or source sentences at each step. The attribution set for an output element is computed as the union of leaf nodes in its execution tree (Wan et al., 17 Jun 2025). This achieves 0% “no attribution” outputs and significantly improves F1 at the document and sentence levels compared to previous methods.
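A minimal sketch of the leaf-union computation over a hypothetical execution-tree structure (the node layout here is illustrative; GenerationPrograms' actual program representation differs):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TraceNode:
    source_id: Optional[str] = None           # set on leaves: tool output or source-sentence ID
    children: List["TraceNode"] = field(default_factory=list)

def attribution_set(node: TraceNode) -> set:
    """Attribution of an output element = union of all leaf sources in its tree."""
    if not node.children:                     # leaf: a single tool output / source sentence
        return {node.source_id} if node.source_id else set()
    return set().union(*(attribution_set(c) for c in node.children))

tree = TraceNode(children=[
    TraceNode(source_id="doc1:s3"),
    TraceNode(children=[TraceNode(source_id="tool:search#2")]),
])
print(attribution_set(tree))  # {'doc1:s3', 'tool:search#2'}
```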
3.3 Throughput Decomposition in Systems
Explainable Throughput Decomposition (ETD) uses analogous Shapley principles to partition system throughput gains among internal components or tools. Each tool's contribution is defined by the marginal improvement in throughput when added to a coalition, with variance and error bounds derived from convexity properties and Hoeffding-type concentration (Alpay et al., 23 Sep 2025).
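As one concrete form of such a concentration guarantee: if each sampled marginal contribution lies in $[a, b]$, the standard Hoeffding inequality (in the spirit of, though not necessarily identical to, the bounds derived in that work) gives, for $m$ sampled permutations,

$$\Pr\bigl(|\hat{\phi}_i - \phi_i| \geq \epsilon\bigr) \leq 2\exp\!\left(-\frac{2m\epsilon^2}{(b-a)^2}\right),$$

so $m = O\bigl((b-a)^2\,\epsilon^{-2}\log(1/\delta)\bigr)$ permutations suffice for an $\epsilon$-accurate estimate with probability at least $1-\delta$.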
4. Social, Organizational, and Behavioral Dimensions
Tool-level attribution also encompasses the explicit or implicit acknowledgment of AI and software tool usage in collaborative or community contexts. In open-source software commits, attribution is operationalized as the presence, form, and explicitness of tool mentions in commit messages (Kraishan, 30 Nov 2025).
- Definition schema: Each commit is classified into “explicit,” “implicit,” “mention-only,” or “none,” using a rule-based classifier over tool-specific keywords and patterns; a minimal sketch of such a classifier follows this list.
- Statistical findings: Among 13,617 commits mentioning AI tools, explicit attribution rates vary from 80.5% (Claude) to 1.3% (Tabnine). Logistic regression shows that tool choice is highly predictive of explicit attribution (e.g., Claude odds ratio of 27.49 vs. Copilot).
- Community dynamics: Explicit attribution leads to modestly elevated scrutiny (23% more questions, 21% more comments), but variance is dominated by tool-specific community norms. Sentiment remains neutral regardless of attribution type, and explicit attribution rates have grown rapidly from near 0% in 2024 to 40% by late 2025.
- Design implications: Effective attribution mechanisms are context-dependent; “transparency by design” (e.g., templates, standardized tags) is favored over blanket mandates, given the risk of overburdening users with scrutiny.
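A minimal sketch of a rule-based commit-message classifier in this spirit; the keyword patterns below are illustrative placeholders, not the rules used in the cited study:

```python
import re

# Illustrative patterns only; the study's actual rule set is more extensive.
EXPLICIT = re.compile(
    r"co-authored-by:.*\b(claude|copilot|cursor|tabnine)\b"
    r"|generated (by|with)\s+(claude|copilot|cursor|tabnine)", re.I)
IMPLICIT = re.compile(r"\bai[- ]assisted\b|\bwith (the )?help of (an )?ai\b", re.I)
MENTION = re.compile(r"\b(claude|copilot|cursor|tabnine|chatgpt)\b", re.I)

def classify_commit(message: str) -> str:
    """Map a commit message to one of the four attribution classes."""
    if EXPLICIT.search(message):
        return "explicit"      # tool credited as co-author or generator
    if IMPLICIT.search(message):
        return "implicit"      # AI assistance acknowledged indirectly
    if MENTION.search(message):
        return "mention-only"  # tool named without being credited
    return "none"

print(classify_commit(
    "Fix parser edge case\n\nCo-authored-by: Claude <noreply@anthropic.com>"))
# -> explicit
```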
5. Limitations and Extensions
- Variance and computation: Monte Carlo estimates of Shapley values trade variance against computational cost. For larger tool sets, sampling ratios must be reduced further, and variance grows if sampling is too aggressive (Horovicz, 14 Dec 2025).
- Tool synergy and interactions: Standard Shapley attribution partitions only individual contributions and does not isolate higher-order tool synergies. Extensions to Shapley interaction indices are under consideration.
- Sequential, causal, or streaming tool use: Current methods treat tool usage as a one-step coalition; multi-turn or causally linked agent architectures require new attributional formalisms or sequential extensions.
- Model and API access: Distributional attribution approaches assume access to model log-probabilities or evaluation oracles, which may not be possible in locked-down APIs (Canonne et al., 25 Jun 2025).
- Normative and ecosystem variability: Social uptake of tool-level attribution practices remains heterogeneous across tool communities and is sensitive to platform design and community leadership (Kraishan, 30 Nov 2025). A plausible implication is that generalizable attribution standards will require modular, adaptable implementation.
6. Future Directions
- Incorporation of interaction Shapley indices and streaming estimates for real-time explanations (Horovicz, 14 Dec 2025)
- Integration of tool-level attribution into agent training pipelines for more discriminative and selective tool reliance
- Hierarchical aggregation combining tool- and token-level attributions (e.g., AgentSHAP with TokenSHAP)
- Mechanism design for attribution in collaborative environments, balancing transparency with sustainability and workflow efficiency
- Robustness studies, including adversarial or mimicry-resistant attribution to distinguish genuine tool use from in-context patching or prompt engineering
7. Summary Table: Tool-Level Attribution Methods
| Approach | Setting | Attribution Formalism |
|---|---|---|
| AgentSHAP | LLM Agents (any API tools) | Monte Carlo Shapley values |
| GenerationPrograms | Text generation (modular) | Program trace union sets |
| Anubis | Code sample source detection | Distributional hypothesis test |
| Glass-Box ETD | Systems performance | Shapley throughput allocation |
| Open-source commit analysis | Software repositories | Rule-based message classification |
The landscape of tool-level attribution now extends from principled game-theoretic credit assignment in AI agents to fine-grained provenance in language tasks and social strategies in collaborative software engineering. Contemporary methods enable both rigorous quantification and interpretable reporting, with ongoing research addressing limitations of scale, interactivity, and ecosystem variance.