Attribution Graph Formalism

Updated 29 March 2026

Attribution Graph Formalism is a structured framework that models credit allocation and causal influence using weighted nodes and edges to represent entities and their interactions.
It employs rigorous algorithms like row-stochastic normalization and path marginalization to compute influence scores across neural networks, flowcharts, and probabilistic models.
The formalism underpins applications in model interpretability, vision-language reasoning, and causal inference, enhancing transparency and decision-making in complex systems.

An attribution graph formalism provides a rigorous, structured framework for representing, computing, and reasoning about the allocation of credit, responsibility, or explanatory influence across entities in complex systems. Its technical scope spans interpretability in neural networks, vision-language agent reasoning, flowchart and document analysis, probabilistic causality, and semantic model theory. This article surveys state-of-the-art attribution graph formalisms, their mathematical definitions, canonical algorithms, formal properties, and representative applications in model interpretability and computational reasoning.

1. Formal Definitions and Core Graph Structures

Attribution graph formalisms instantiate directed or undirected graphs where nodes represent agents, model components, events, activation patterns, regions, or semantic units, and edges encode causal/contributory relations weighted by appropriate scoring or influence functions.

Example Frameworks

Attributed Flowchart Graph: $G = (V, E, A)$, where $V$ is the set of nodes (flowchart symbols), $E \subseteq V \times V$ is the set of directed control-flow edges, and $A$ assigns attributes to both nodes (text labels, shapes) and edges (condition strings, arrow styles) (2506.01344).
LLM Context Attribution Graph: $G = (V, E, W)$, with nodes for all prompt and generated tokens, edges for allowable time-ordered influence, and $W$ a row-stochastic, strictly lower-triangular matrix encoding token-to-token influence scores (Walker et al., 17 Dec 2025).
GNN Attributions: Nodes are graph features (edges, nodes, activation patterns), with edges encoding functional dependencies or information flows, and explicit attribution scores assigned via analytic or gradient-based decompositions (Lu et al., 2024, Xie et al., 2019).
Feature Attribution in Mechanistic Interpretability: Nodes index feature activations (layer, token, feature index triplets), edges represent cross-layer/cross-feature attributions, and edge weights are computed via learned transcoder Jacobians (Draye et al., 22 Mar 2026).
Attribution in Graphical Point Processes: Nodes are event types (e.g., marketing touchpoints, conversion), directed edges encode Granger-causal structure, and edge weights parameterize influence kernels for probabilistic attribution of downstream events (Tao et al., 2023).
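The frameworks above share a common shape: a node set plus weighted directed edges. A minimal sketch of such a container is given below; the class name `AttributionGraph` and its methods are hypothetical and are not taken from any of the cited works.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class AttributionGraph:
    """Directed graph whose edge weights are attribution scores (illustrative only)."""
    nodes: List[str]
    # weights[(src, dst)] = influence of src on dst
    weights: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def add_edge(self, src: str, dst: str, weight: float) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be declared nodes")
        self.weights[(src, dst)] = weight

    def incoming(self, dst: str) -> Dict[str, float]:
        """Return {src: weight} for every edge pointing into dst."""
        return {s: w for (s, d), w in self.weights.items() if d == dst}


# Two prompt tokens influencing one generated token (made-up weights).
g = AttributionGraph(nodes=["p1", "p2", "y1"])
g.add_edge("p1", "y1", 0.7)
g.add_edge("p2", "y1", 0.3)
```

Storing weights by (source, target) pair keeps the structure agnostic to what the nodes mean, which is the main point of the comparison across frameworks.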

The table below highlights the diversity of attribution graph instantiations:

Framework	Vertex Semantics	Edge Semantics	Edge Weights/Labels
Flowchart Attribution (2506.01344)	Diagram entities	Control-flow, conditions	Text labels, semantic match
LLM Context Attribution (Walker et al., 17 Dec 2025)	Prompt/generated tokens	Influence (temporal)	Row-stochastic direct effect
GNN Output Attribution (Lu et al., 2024)	Features/Activations	Message/Activation flow	Monomial-determined shares
Mechanistic Interp. (Draye et al., 22 Mar 2026)	(Layer, Pos, Feature)	MLP Jacobian connections	Circuit-tracer scores
Multi-touch Attribution (Tao et al., 2023)	Event types	Granger causality	Causal coefficients

2. Attribution Score Construction and Path Evaluation

A defining technical element of attribution graph formalisms is the rigorous computation of edge weights and the marginalization or composition of influence along paths.

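For flowchart attribution (2506.01344):

Edge labels are extracted by OCR-parsing the conditions annotated on arrows: $A^E(e) = \mathrm{cond}(u,v)$ for an edge $e = (u, v)$.
Region-to-node mapping links image regions (pixel sets) to symbolic nodes: $\phi: \mathcal{R} \to V$.
Attribution path scores reward semantic alignment and edge-label matches and subtract a path-length penalty. For a path $P = (v_1, \dots, v_{|P|})$ with edges $e_j = (v_j, v_{j+1})$ and a statement $s$ to be attributed: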

$S(P) = \sum_{j=1}^{|P|-1}\left[\mathrm{sim}(\mathrm{stmt}(v_j), s) + \mathrm{match}(A^E(e_j), s)\right] - \lambda(|P|-1)$

Attribution path constraints include edge legality, minimum length, semantic thresholding, and exclusivity of explanatory coverage.
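The path score can be illustrated with a short sketch. The similarity and match functions are not specified here, so the token-overlap function below is a hypothetical stand-in for both, and the default penalty `lam=0.1` is an arbitrary choice.

```python
def token_overlap(a: str, b: str) -> float:
    """Hypothetical stand-in for sim/match: Jaccard overlap of lowercase tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def path_score(stmts, edge_labels, s, lam=0.1):
    """S(P) for a path with node statements `stmts` and edge labels `edge_labels`.

    len(edge_labels) == len(stmts) - 1. The sum runs over j = 1 .. |P|-1,
    so the final node contributes no similarity term, as in the formula.
    """
    n_edges = len(stmts) - 1
    total = 0.0
    for j in range(n_edges):
        total += token_overlap(stmts[j], s) + token_overlap(edge_labels[j], s)
    return total - lam * n_edges


# Toy decision node with two outgoing branches (made-up text).
statement = "the yes branch runs when x is positive"
yes_path = path_score(["check x is positive", "run step a"], ["yes"], statement)
no_path = path_score(["check x is positive", "run step b"], ["no"], statement)
```

Both candidate paths share the same decision node and length, so the difference in score comes entirely from the edge-label match.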

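For LLM context attribution (Walker et al., 17 Dec 2025):

Row-stochastic normalization: $\sum_{i<j} w_{ij} = 1$ for each generated token $j$; $w_{ij} = 0$ for $i \geq j$ (causality), where $w_{ij}$ weights the edge from token $i$ into token $j$.
Context attributions for outputs are computed by summing over all paths with edge-weight products, expressible as entries in $(I - W)^{-1}$, where $W$ is the strictly lower-triangular adjacency matrix:

$(I - W)^{-1} = \sum_{k \geq 0} W^k$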

This establishes a closed-form path-marginalization principle.
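A minimal sketch of normalization followed by all-path aggregation is given below. It assumes that the all-path sum equals $(I - W)^{-1}$, which is consistent with the matrix inversion mentioned for this method, and it stores the weight of the edge from token `i` into token `j` at `W[j][i]` so that each row belongs to one target token. The function names and the input numbers are illustrative.

```python
def normalize_rows(base):
    """Clamp negatives to zero, then scale each nonzero row to sum to 1.

    base[j][i] is the raw score for the influence of token i on token j;
    only entries with i < j are kept (causality).
    """
    n = len(base)
    W = [[0.0] * n for _ in range(n)]
    for j in range(n):
        clamped = [max(base[j][i], 0.0) for i in range(j)]
        total = sum(clamped)
        if total > 0:
            for i in range(j):
                W[j][i] = clamped[i] / total
    return W


def all_path_influence(W):
    """Return T = (I - W)^{-1} by forward substitution, using T = I + W T.

    T[j][i] is the sum over all paths i -> ... -> j of edge-weight products.
    Because W is strictly lower-triangular, row j depends only on rows < j.
    """
    n = len(W)
    T = [[0.0] * n for _ in range(n)]
    for j in range(n):
        T[j][j] = 1.0
        for k in range(j):
            if W[j][k]:
                for i in range(k + 1):
                    T[j][i] += W[j][k] * T[k][i]
    return T


# Tokens 0 and 1 are prompt tokens (zero rows); tokens 2 and 3 are generated.
base = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [3.0, 1.0, 0.0, 0.0],
    [1.0, -2.0, 1.0, 0.0],
]
W = normalize_rows(base)
T = all_path_influence(W)
```

In this toy case the last generated token attributes 0.875 to prompt token 0 (0.5 directly plus 0.5 × 0.75 through token 2) and 0.125 to prompt token 1, so the prompt-level attributions again sum to one.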

For GNN output attribution with GOAt (Lu et al., 2024):

The GNN computation is expanded as a sum over monomials $m$ (products of binary adjacency entries, node features, fixed weights, and activation patterns).
Each variable’s contribution to a monomial $m$ containing $|m|$ variables is $m / |m|$ ("Equal Contribution Lemma").
Node/edge attributions are summed across all monomials in which they appear:

$I(x) = \sum_{m \ni x} \frac{m}{|m|}$
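The equal-splitting rule can be sketched as follows. The representation of a monomial as a coefficient plus a dictionary of named variables is an assumption made for illustration, and every monomial is assumed to contain at least one variable.

```python
from collections import defaultdict


def equal_split_attribution(monomials):
    """monomials: list of (coefficient, {variable_name: value}).

    Each monomial's value is coefficient * product(values). It is split
    equally among the variables it contains, and shares are summed per variable.
    """
    attribution = defaultdict(float)
    for coeff, variables in monomials:
        value = coeff
        for v in variables.values():
            value *= v
        share = value / len(variables)
        for name in variables:
            attribution[name] += share
    return dict(attribution)


# Toy expansion: output = 2*a12*x1 + 3*a12*a23*x3, with all variables set to 1.
terms = [
    (2.0, {"a12": 1.0, "x1": 1.0}),
    (3.0, {"a12": 1.0, "a23": 1.0, "x3": 1.0}),
]
attr = equal_split_attribution(terms)
```

The edge variable `a12` appears in both monomials and collects a share from each, while the attributions still add up to the total output of 5.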

For hierarchical GCN attribution with NAM (Xie et al., 2019):

Local (intra-layer) contribution: a gradient-based score for how much a node at one layer contributes to a node at the adjacent layer.
Hierarchical: contributions of neighbors and their ancestors are computed by multiplying intra-layer attributions along all active paths, with final node attributions summed over all possible paths into the target node.
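The hierarchical accumulation can be sketched as a backward sweep over layers. The local contribution values are taken as given; how they are computed is not shown, and the array layout `local[l][i][j]` is an assumption.

```python
def propagate_contributions(local, target):
    """local[l][i][j]: local contribution of node j at layer l to node i at layer l+1.

    Returns the accumulated contribution of each input-layer node to `target`
    in the last layer, i.e. the sum over all paths of products of local terms.
    """
    # Start from an indicator on the target node in the final layer.
    n_last = len(local[-1])
    current = [1.0 if i == target else 0.0 for i in range(n_last)]
    for layer in reversed(local):
        n_prev = len(layer[0])
        prev = [0.0] * n_prev
        for i, weight_i in enumerate(current):
            if weight_i:
                for j in range(n_prev):
                    prev[j] += weight_i * layer[i][j]
        current = prev
    return current


# Path graph 0 - 1 - 2 with self-loops, two layers, made-up local shares.
layer = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]
contrib = propagate_contributions([layer, layer], target=0)
```

Node 2 is two hops from node 0, so it only receives a contribution through node 1 (0.5 × 0.25 = 0.125).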

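For multi-touch attribution in graphical point processes (Tao et al., 2023):

Attribution score (Direct Removal Effect) for a touchpoint set: the change in the modeled downstream outcome when the touchpoints in the set are removed, counting only their direct influence on that outcome.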

Total Removal Effect accounts for all causal progeny via thinning: recursively propagating removal effects along the directed causality graph.
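The difference between direct and total removal effects can be illustrated on a toy model. The sketch below assumes a linear branching model over an acyclic graph of event types, with expected counts in place of a fitted point process; it is not the estimator of the cited work, and all names and numbers are hypothetical.

```python
def expected_counts(mu, K, order, removed=frozenset()):
    """Expected event counts per type in a toy linear branching model.

    mu[t]: baseline count of type t; K[s][t]: expected number of type-t events
    triggered directly by one type-s event; order: topological order of types.
    Types in `removed` are thinned out completely (count forced to zero).
    """
    N = {}
    for t in order:
        if t in removed:
            N[t] = 0.0
            continue
        N[t] = mu[t] + sum(N[s] * K.get(s, {}).get(t, 0.0) for s in N)
    return N


def direct_removal_effect(mu, K, order, removed, target):
    """Target events lost when only the direct edges from `removed` are cut."""
    N = expected_counts(mu, K, order)
    return sum(N[s] * K.get(s, {}).get(target, 0.0) for s in removed)


def total_removal_effect(mu, K, order, removed, target):
    """Target events lost when `removed` and all of its causal progeny are thinned."""
    before = expected_counts(mu, K, order)[target]
    after = expected_counts(mu, K, order, frozenset(removed))[target]
    return before - after


mu = {"ad": 10.0, "email": 4.0, "conversion": 1.0}
K = {"ad": {"email": 0.5, "conversion": 0.1}, "email": {"conversion": 0.2}}
order = ["ad", "email", "conversion"]
direct = direct_removal_effect(mu, K, order, {"ad"}, "conversion")
total = total_removal_effect(mu, K, order, {"ad"}, "conversion")
```

Here the total effect is larger than the direct effect because removing ads also removes the emails they trigger, and those emails would have produced conversions of their own.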

3. Algorithms and Computational Schemes

Attribution graph construction and inference rely on specialized algorithmic pipelines tailored to their semantics.

Neurosymbolic Attribution Agent (Flowchart): Iterative visual segmentation → graph construction → agentic path search, scoring, and memory update, terminating upon full alignment or convergence (2506.01344).
LLM Attribution Graphs: Sequential per-step base attribution, clamping negatives and normalizing, with dense matrix inversion for all-path aggregation (Walker et al., 17 Dec 2025).
GOAt: Symbolic forward unrolling, per-monomial attribution computation, aggregation, and redistribution of activation-pattern attributions to input features (Lu et al., 2024).
NAM (GCN): Layer-wise forward pass, full gradient backpropagation for every node, and hierarchical accumulation of contributions for all $k$-hop neighbors (Xie et al., 2019).
Circuit-Tracer (Mechanistic Interp.): For each feature activation, Jacobian-based causal computation, edge pruning for sparsity, and layer-wise feature sharing exploitation (Draye et al., 22 Mar 2026).
Point Process Attribution: Recursive or Monte Carlo thinning/backpropagation to propagate removal effects through Granger-causality graphs (Tao et al., 2023).

4. Theoretical Properties and Constraints

Formalism design imposes both representational and algorithmic constraints to guarantee valid, interpretable attributions.

Causality and Row-stochasticity: No edge points backward in generation or time (LLM and point process). Edge weights into any node sum to 1 (ensuring probabilistic interpretability).
Path Legality and Exclusivity: Only valid control-flow or causality-preserving paths are considered. The attribution set must contain only the nodes and edges necessary for the explanation (2506.01344, Walker et al., 17 Dec 2025).
Sparsity: Many formalisms employ explicit sparsification (e.g., via thresholding on weights or $L_1$ regularization), ensuring computational tractability and interpretability (Draye et al., 22 Mar 2026).
Compositionality and Linearity: Some GNN attributions exploit multilinear expansion and linearity to sum over paths or monomials, guaranteeing that local share-splitting is well-defined (Lu et al., 2024).
Structural Coherence: Extended frameworks (e.g., graph-theoretic belief models) impose local/global coherence conditions (absence of contradiction cycles, nonexistence of unsupported beliefs) (Nikooroo, 5 Aug 2025).
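Two of these constraints are easy to check mechanically. The sketch below tests causality and row-stochasticity for a weight matrix stored with one row per target node, and applies threshold-based sparsification. Renormalizing the surviving weights after thresholding is an assumption made here to keep rows summing to one, not a step stated by the cited works.

```python
def is_causal_row_stochastic(W, tol=1e-9):
    """True if W is strictly lower-triangular, nonnegative, and each nonzero row sums to 1."""
    for j, row in enumerate(W):
        if any(abs(w) > tol for w in row[j:]):
            return False  # an edge points backward in time or to itself
        if any(w < -tol for w in row):
            return False
        total = sum(row)
        if total > tol and abs(total - 1.0) > tol:
            return False
    return True


def sparsify(W, threshold):
    """Drop weights below `threshold`, then renormalize each surviving row."""
    out = []
    for row in W:
        kept = [w if w >= threshold else 0.0 for w in row]
        total = sum(kept)
        out.append([w / total for w in kept] if total > 0 else kept)
    return out


W = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.05, 0.95, 0.0],
]
W_sparse = sparsify(W, threshold=0.1)
```

All-zero rows are accepted so that source nodes, which have no incoming edges, do not fail the check.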

5. Applications and Benchmarks

Attribution graph formalisms underpin a suite of interpretability, explainability, and multi-entity credit-assignment applications.

Vision-language reasoning over flowcharts: Neural-symbolic agents (FlowPathAgent) use attribution paths to mitigate LLM hallucination and improve grounded explanation, evaluated on benchmarks (FlowExplainBench) (2506.01344).
Analysis of autoregressive LLMs: Context attribution via graphs (CAGE) demonstrates improved alignment between prompt and output, with up to 40% faithfulness gains on multiple LLMs (Walker et al., 17 Dec 2025).
Mechanistic interpretability for transformer models: Cross-layer attributions elucidate how latent features propagate across layers, enabling scalable interventions and feature-sharing insights (Draye et al., 22 Mar 2026).
Explaining GNN predictions: GOAt and NAM provide complete (GOAt: analytic, NAM: gradient-pathwise) attribution for predictions, enabling node/edge-level saliency visualization and stability/discriminability measurement (Lu et al., 2024, Xie et al., 2019).
Causal credit assignment in multi-touch marketing attribution: Graphical point process formalism quantifies both direct and total causal effect, supporting rigorous per-path attribution strategies relevant for marketing analytics (Tao et al., 2023).

6. Representative Examples

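Flowchart attribution: given a flowchart in which a decision node tests whether X is positive and its “Yes” and “No” branches lead to different steps, the query “Which step runs when X is positive?” selects the path through the “Yes” branch. That path is found by semantic alignment and edge-label matching, and its total score exceeds that of the alternative “No” path.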

LLM context attribution: given prompt tokens and a sequence of generated tokens, each generated token receives base attributions over all earlier tokens. Row-stochastic normalization of these base attributions yields the edge weights, and context attributions sum over direct (prompt-to-generation) and indirect (prompt-to-generation-via-generation) paths via matrix inversion.

GNN output attribution: for a small input graph, the GNN output is expanded into monomials, and the attribution of each node or edge is the sum of its equal shares over all monomials in which it appears.

The attribution graph paradigm intersects with, and extends, several other graph-based frameworks:

Attributed Vertex Replacement Grammars: Encode generative models for attributed graph languages, where splitting and rewiring of attributed subgraphs are formalized via grammar production rules (Sikdar et al., 2021).
Attributed Structures with Cloning: Category-theoretic construction of attributed graph transformation rules with explicit cloning semantics; central for data-rich rewriting and adaptive architectural synthesis (Duval et al., 2014).
Graphical Semantics of Quantification: Directed semantic graphs for natural language meaning representation, combining unification-based construction, second-order variable semantics, and compositionality (Cao, 2021).
Belief and Epistemic Graphs: Distinction between credibility (exogenous, source-level) and confidence (endogenous, structurally-supported) in epistemic graph models (Nikooroo, 5 Aug 2025).

The attribution graph formalism thus enables principled, mathematically precise propagation and localization of influence, fairness, causality, or explanatory power in domains as varied as neural interpretability, vision-language reasoning, causal inference, and structured semantic modeling. Its architecture-specific instantiations and scoring rules are adapted to the semantics of each setting but are unified by their rigorous path-based and compositional allocation of credit or influence.