Causal-Driven Attribution (CDA)

Updated 4 July 2026

Causal-Driven Attribution (CDA) is a framework that assigns credit according to causal effects rather than correlations, using structural causal models and counterfactual comparisons.
It employs methodologies like PCMCI, front-door adjustment, and Shapley value adaptations to isolate true causal drivers across various application domains.
Recent empirical evaluations demonstrate CDA’s potential in marketing, model interpretation, and multi-agent systems despite challenges such as hidden confounding and model misspecification.

Causal-Driven Attribution (CDA) denotes attribution procedures that assign credit, blame, or explanatory responsibility according to identified causal effects rather than correlations. In the literature, the term appears explicitly in settings such as channel attribution from aggregated impression-level data, where CDA combines temporal causal discovery with structural causal effect estimation (Filippou et al., 24 Dec 2025), and in creator-ecosystem optimization, where attribution is defined as assigning credit to touchpoints based on identified causal effects rather than correlations (Liu et al., 9 May 2026). Closely related work applies the same principle to multi-agent system debugging, black-box model interpretation, climate detection and attribution, and human–AI responsibility, even when the label “CDA” is not used verbatim (Ma et al., 10 Sep 2025, Khademi et al., 2020, Risser et al., 2024, Qi et al., 2024). The literature suggests that CDA is best understood not as a single estimator but as a family of methods built around structural causal models, potential outcomes, do-interventions, temporal causal graphs, and counterfactual comparisons.

1. Concept and scope

At its core, CDA is a reaction against purely associational attribution. In recommendation and marketing systems, correlational multi-touch attribution such as last-touch, time-decay, or Shapley scores without causal controls is described as failing under unobserved confounding and shortcut leakage (Liu et al., 9 May 2026). In model explanation, correlations, gradients, PDP/ICE, LIME, and SHAP are described as capturing associational or local sensitivity rather than interventional effects, so they can confound spurious correlations with causal drivers when features are correlated or structurally related (Khademi et al., 2020). In multi-agent systems, current diagnostic tools are described as relying on statistical correlations and achieving less than 15% accuracy in locating the root-cause step of a failure on challenging benchmarks such as Who&When, motivating causal failure attribution (Ma et al., 10 Sep 2025).

The term also spans several attribution targets. Some papers treat channels, touchpoints, or exposures as the units of causal credit; others attribute outcomes to features, latent concepts, agents, execution steps, forcings, or decision makers. The common requirement is that the attributed quantity be tied to a causal path, a causal mechanism, or a counterfactual contrast, not merely to predictive usefulness. Several papers state this directly by defining attribution as the effect of setting a variable, deleting a touchpoint, replacing a mechanism, or switching from a factual to a counterfactual world (Filippou et al., 24 Dec 2025, Chattopadhyay et al., 2019, Quintas-Martinez et al., 2024).

Setting	Attribution target	Representative formulation
Aggregated marketing	Channels	PCMCI + SCM + do-interventions
Creator recommendation	Touchpoints	Front-door mediator + IPW
Multi-agent systems	Agents and steps	PCI + Shapley + CDC-MAS
Model explanation	Features or latent factors	Potential outcomes or SCM counterfactuals
Climate and human–AI systems	Forcings, mechanisms, or agents	Statistical counterfactuals or responsibility SCMs

A further distinction concerns whether CDA is descriptive, diagnostic, or actionable. In some works it stops at estimating channel influence or feature effect. In others it closes a diagnose–validate–optimize loop, as in multi-agent debugging and creator-side optimization, where the attribution result directly drives interventions and re-execution (Ma et al., 10 Sep 2025, Liu et al., 9 May 2026).

2. Causal formalization and estimands

Most CDA formulations start from either a structural causal model or a potential-outcomes formulation. In black-box model interpretation, the predictor is treated as producing an observed outcome $\hat Y = f(X)$ , each feature $X_k$ is treated as a treatment, and attribution is defined through estimands such as

$\text{ATE}_k(a,b) = \mathbb{E}\big[\hat Y(X_k=a)-\hat Y(X_k=b)\big]$

and

$\tau_k(a,b \mid x_{-k}) = \mathbb{E}\big[\hat Y \mid \operatorname{do}(X_k=a), X_{-k}=x_{-k}\big]-\mathbb{E}\big[\hat Y \mid \operatorname{do}(X_k=b), X_{-k}=x_{-k}\big].$

Identification is then expressed through the g-formula

$\mathbb{E}[\hat Y \mid \operatorname{do}(X_k=a)] = \int \mathbb{E}[\hat Y \mid X_k=a, X_{-k}=x]\, dP(X_{-k}=x),$

under ignorability and positivity assumptions (Khademi et al., 2020). Neural-network CDA makes the same interventional move but interprets the network architecture itself as an SCM and defines attribution as an Average Causal Effect of an input feature on the output (Chattopadhyay et al., 2019).
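As a minimal sketch of the g-formula above, the following Python snippet estimates $\mathbb{E}[\hat Y \mid \operatorname{do}(X_k=a)]$ by fixing feature $k$ to a value in every row and averaging the black-box predictions over the empirical distribution of the remaining features. The names `g_formula_mean`, `ate`, and `toy_model` are hypothetical and introduced only for illustration. This is the plain plug-in form of the identity, meaningful only under the ignorability and positivity assumptions just stated; it is not the weighting-based estimator of the cited work.

```python
import numpy as np

def g_formula_mean(predict, X, k, value):
    """Plug-in estimate of E[Y_hat | do(X_k = value)].

    Sets column k to `value` for every row, then averages the model output
    over the empirical distribution of the other features X_{-k}.
    """
    X_do = np.array(X, dtype=float, copy=True)
    X_do[:, k] = value  # intervene on feature k
    return float(np.mean(predict(X_do)))

def ate(predict, X, k, a, b):
    """ATE_k(a, b) = E[Y_hat | do(X_k=a)] - E[Y_hat | do(X_k=b)]."""
    return g_formula_mean(predict, X, k, a) - g_formula_mean(predict, X, k, b)

# Hypothetical black box with an interaction between features 0 and 1.
def toy_model(X):
    return 2.0 * X[:, 0] + X[:, 0] * X[:, 1]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))

# For this toy model the contrast equals 2 + mean(X[:, 1]).
effect = ate(toy_model, X, k=0, a=1.0, b=0.0)
```

When features are strongly dependent, some of the rows created by the intervention may lie outside the support of the data, which is exactly the positivity concern; in that case the average should be read with caution.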

In channel and touchpoint attribution, the estimands are often temporal and mechanism-aware. In aggregated marketing data, CDA defines an average causal effect for a channel-level intervention as

$ACE_i(\Delta)=\mathbb{E}[Y_t \mid \operatorname{do}(X_{i,t}:=X_{i,t}+\Delta)]-\mathbb{E}[Y_t],$

and decomposes total effect along directed paths recovered by PCMCI and an SCM (Filippou et al., 24 Dec 2025). In creator-ecosystem optimization, the central identification device is front-door adjustment through a mediator $M$ , yielding

$\mathbb{E}[Y \mid \operatorname{do}(T=t)] = \sum_{m,t',x}\mathbb{E}[Y \mid M=m, T=t', X=x]\,P(T=t',X=x)\,P(M=m \mid T=t, X=x),$

with inverse propensity weighting used for observed backdoor adjustment and deletion-based uplift

$\hat\Delta(\tau_j \mid X,T)=g_\theta(X,T)-g_\theta(X,T\setminus\{\tau_j\})$

used for per-touchpoint attribution (Liu et al., 9 May 2026).
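The front-door sum above can be evaluated directly when $T$, $M$, and $X$ are discrete. The sketch below is illustrative only: the function name `front_door_mean` and the toy tables are assumptions, not part of ALM-MTA. The tables were derived by hand from a hypothetical model with a hidden binary confounder of $T$ and $Y$, in which the true effect of $T$ on $Y$ through $M$ is 0.8.

```python
import numpy as np

def front_door_mean(mu, p_tx, p_m_given_tx, t):
    """Front-door estimate of E[Y | do(T=t)] from discrete tables.

    mu[m, s, x]           = E[Y | M=m, T=s, X=x]
    p_tx[s, x]            = P(T=s, X=x)
    p_m_given_tx[m, s, x] = P(M=m | T=s, X=x)
    """
    # Sum over m, s (the t' index) and x; the mediator law is evaluated at T=t.
    return float(np.einsum("msx,sx,mx->", mu, p_tx, p_m_given_tx[:, t, :]))

# Toy observational tables: binary T and M, a single covariate level x=0.
p_tx = np.array([[0.5], [0.5]])
p_m_given_tx = np.array([[[0.9], [0.1]],   # P(M=0 | T=0), P(M=0 | T=1)
                         [[0.1], [0.9]]])  # P(M=1 | T=0), P(M=1 | T=1)
mu = np.array([[[0.4], [1.6]],             # E[Y | M=0, T=0], E[Y | M=0, T=1]
               [[1.4], [2.6]]])            # E[Y | M=1, T=0], E[Y | M=1, T=1]

# Front-door contrast: removes the bias from the hidden confounder.
effect = front_door_mean(mu, p_tx, p_m_given_tx, 1) - front_door_mean(mu, p_tx, p_m_given_tx, 0)

# Naive contrast E[Y | T=1] - E[Y | T=0] for comparison (confounded).
naive = sum(
    p_m_given_tx[m, 1, 0] * mu[m, 1, 0] - p_m_given_tx[m, 0, 0] * mu[m, 0, 0]
    for m in range(2)
)
```

In this toy setting the naive contrast overstates the effect (2.0 against 0.8), because the hidden confounder raises both the probability of treatment and the outcome.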

Other domains use different causal functionals. Climate attribution alternates between predictive, Granger-causal quantities and Pearl-style interventional quantities such as the risk ratio

$RR=\frac{P(E \mid \operatorname{do}(A=1))}{P(E \mid \operatorname{do}(A=0))}$

and the fraction of attributable risk

$FAR = 1-\frac{1}{RR} = 1-\frac{P(E \mid \operatorname{do}(A=0))}{P(E \mid \operatorname{do}(A=1))},$

while DADA computes likelihoods of full observed trajectories under factual and counterfactual forcings and converts them into the probability of necessary causation (Risser et al., 2024, Hannart et al., 2015). Human–AI collaboration work defines blameworthiness from differences in outcome probabilities across policies and then discounts it by expected cost (Qi et al., 2024).
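For the climate quantities, a short numerical illustration may help. The snippet below computes the risk ratio and the fraction of attributable risk, using the standard relation $FAR = 1 - 1/RR$, from two event probabilities. The function names and the probabilities 0.04 and 0.01 are made up for illustration and do not come from any cited study.

```python
def risk_ratio(p_factual, p_counterfactual):
    """RR = P(E | do(A=1)) / P(E | do(A=0))."""
    if p_counterfactual <= 0:
        raise ValueError("counterfactual probability must be positive")
    return p_factual / p_counterfactual

def fraction_attributable_risk(p_factual, p_counterfactual):
    """FAR = 1 - 1/RR = 1 - P(E | do(A=0)) / P(E | do(A=1))."""
    if p_factual <= 0:
        raise ValueError("factual probability must be positive")
    return 1.0 - p_counterfactual / p_factual

# Hypothetical event probabilities with and without the forcing.
p1, p0 = 0.04, 0.01
rr = risk_ratio(p1, p0)                    # about 4: the event is four times as likely
far = fraction_attributable_risk(p1, p0)   # about 0.75
```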

3. Recurrent methodological motifs

A major recurring motif is causal structure discovery before attribution. Aggregated marketing CDA uses PCMCI to discover a directed temporal graph over channel impressions, conversions, and optional covariates, then estimates structural equations on the discovered parents (Filippou et al., 24 Dec 2025). Causal SHAP uses the PC algorithm to recover a CPDAG and IDA to quantify causal strength, then modifies the SHAP value function so that coalitions are evaluated under graph-consistent interventions rather than under feature independence (Ng et al., 31 Aug 2025). Climate work similarly frames Granger-causal attribution through restricted and unrestricted VARs with controls, and then recommends hybridizing this predictive perspective with Pearl-style counterfactuals from dynamical experiments (Risser et al., 2024).

A second motif is explicit reweighting or mediator-based identification. Potential-outcomes-based model explanation uses CBPS, NPCBPS, PSWGBM, OPTWEIGHT, IPTW, and Super Learner propensity modeling to balance covariates across treatment levels and estimate dose–response functions (Khademi et al., 2020). Task-Driven Causal Feature Distillation treats each feature as an intervention, estimates propensity scores with adaptive group Lasso, and converts raw inputs into causal feature attributions before downstream prediction (Chu et al., 2023). Multiply-Robust Causal Change Attribution generalizes this strategy by combining nested regressions and Radon–Nikodým weight ratios so that, for each mechanism step, either the regression or the weight can be misspecified while still preserving identification of the target parameter (Quintas-Martinez et al., 2024).
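The reweighting idea can be illustrated with the simplest member of this family, a normalized inverse-probability-weighted contrast for a binary treatment. This is a sketch under the assumption that the propensity scores are known; the function name `ipw_ate` and the eight-row dataset are hypothetical, and none of the named estimators (CBPS, Super Learner, and so on) is reproduced here.

```python
import numpy as np

def ipw_ate(y, t, e):
    """Normalized (Hajek) IPW estimate of E[Y(1)] - E[Y(0)].

    y: outcomes, t: binary treatment indicator, e: propensity P(T=1 | X).
    """
    y, t, e = (np.asarray(v, dtype=float) for v in (y, t, e))
    w1 = t / e                # weights for treated units
    w0 = (1.0 - t) / (1.0 - e)  # weights for control units
    return float(np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0))

# Hypothetical data: covariate x shifts both treatment uptake and the outcome.
x = np.array([0, 0, 0, 0, 1, 1, 1, 1])
t = np.array([1, 0, 0, 0, 1, 1, 1, 0])
y = np.array([1.0, 0.0, 0.0, 0.0, 3.0, 3.0, 3.0, 2.0])
e = np.where(x == 1, 0.75, 0.25)  # assumed known propensities

naive = y[t == 1].mean() - y[t == 0].mean()  # confounded by x
adjusted = ipw_ate(y, t, e)                  # balances x across arms
```

Within each level of `x` the treated and control outcomes differ by exactly 1, which the weighted contrast recovers, while the unweighted difference in means is twice as large.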

A third motif is cooperative decomposition under causal semantics. In multi-agent systems, SBSLocator treats agents as players in a cooperative game whose worth is expected performance under causal interventions, then approximates Shapley values by Monte Carlo and combines them with counterfactual improvement into an integrated bottleneck score (Ma et al., 10 Sep 2025). Causal SHAP similarly weights Shapley-like marginal contributions by a causal strength factor derived from IDA-estimated path effects, while maintaining local accuracy through normalization (Ng et al., 31 Aug 2025). Change attribution work embeds multiply robust estimators inside Shapley aggregation over shifted mechanisms, so that Shapley values inherit consistency and asymptotic normality from the underlying estimator (Quintas-Martinez et al., 2024).
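A generic permutation-sampling approximation of Shapley values shows the mechanics behind this kind of decomposition. It is not SBSLocator itself: the function `monte_carlo_shapley`, the agent names, and the toy value function (standing in for expected performance when only a given coalition of agents is left functioning) are all assumptions made for illustration.

```python
import random

def monte_carlo_shapley(players, value, n_permutations=2000, seed=0):
    """Approximate Shapley values by averaging marginal contributions
    over randomly sampled player orderings."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(n_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        previous = value(frozenset(coalition))
        for p in order:
            coalition.add(p)
            current = value(frozenset(coalition))
            totals[p] += current - previous  # marginal contribution of p
            previous = current
    return {p: total / n_permutations for p, total in totals.items()}

# Hypothetical worth of a coalition of agents under intervention.
def toy_value(coalition):
    score = 0.0
    if "planner" in coalition:
        score += 0.5
    if "coder" in coalition:
        score += 0.3
    if "planner" in coalition and "coder" in coalition:
        score += 0.2  # synergy shared between the two agents
    return score

agents = ["planner", "coder", "reviewer"]
phi = monte_carlo_shapley(agents, toy_value)
```

The exact Shapley values for this toy game are 0.6 for the planner, 0.4 for the coder, and 0 for the reviewer; the sampled estimates approach them as the number of permutations grows, and their sum equals the worth of the full coalition in every run.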

A fourth motif is counterfactual validation. Visual-model CDA generates counterfactual latent factors from Distributional Causal Graphs and estimates instance-level effects by averaging classifier outputs over counterfactual images (Parafita et al., 2019). Open-world model attribution by Counterfactually Decoupled Attention Learning defines a causal effect as the gap between factual and counterfactual attention-weighted predictions and uses that effect as a supervisory signal to force attention toward model-specific artifacts rather than source bias (Zheng et al., 29 Jun 2025). In multi-agent debugging, counterfactual validation is integrated into an optimization loop that generates targeted suggestions, simulates interventions, and retains changes only if re-execution improves performance (Ma et al., 10 Sep 2025).

4. Domain-specific instantiations

In multi-agent systems, CDA is instantiated as a multi-granularity failure attribution pipeline. A failed execution trajectory is represented as a sequence of steps, each recording the acting agent, the action taken, the time, and the context. A data dependency graph is built over these steps from input–output containment and then transformed by the performance causal inversion principle into a performance causal graph by reversing the edges. Agent-level attribution is then handled by Shapley-based SBSLocator, step-level attribution by CDC-MAS and counterfactual validation, and the resulting diagnosis is fed into a closed-loop optimization mechanism (Ma et al., 10 Sep 2025).
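The graph construction described above can be sketched in a few lines. The step fields (`agent`, `input`, `output`) and the reading of "input–output containment" as plain substring containment are assumptions made for illustration; the cited work may define containment differently.

```python
def build_dependency_graph(steps):
    """Return data-flow edges (i, j): the output of an earlier step i
    appears inside the input of a later step j."""
    edges = set()
    for j, later in enumerate(steps):
        for i in range(j):
            produced = steps[i]["output"]
            if produced and produced in later["input"]:
                edges.add((i, j))
    return edges

def reverse_edges(edges):
    """Flip every edge, turning the data-flow graph into the reversed graph."""
    return {(j, i) for (i, j) in edges}

# Hypothetical three-step trajectory.
steps = [
    {"agent": "planner", "input": "task: sum 2 and 3", "output": "plan: add 2 and 3"},
    {"agent": "coder", "input": "plan: add 2 and 3", "output": "result=6"},
    {"agent": "reviewer", "input": "check result=6 against plan: add 2 and 3", "output": "approved"},
]

data_graph = build_dependency_graph(steps)       # edges follow the data flow
performance_graph = reverse_edges(data_graph)    # edges point back toward producers
```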

In marketing and recommendation, CDA appears in two distinct forms. One form eliminates user-level paths entirely: aggregated impressions and conversions are modeled as multivariate time series, PCMCI discovers lagged interdependencies, and an SCM is used to compute direct effects, indirect effects, windowed contributions, and normalized attribution shares without user identifiers or click-path tracking (Filippou et al., 24 Dec 2025). A second form uses very large-scale non-RCT logs and front-door identification. ALM-MTA models exposures $T$, a latent mediator $M$, an adversarial proxy, and outcome $Y$, then combines front-door adjustment, adversarial mediator learning, contrastive conditioning for positivity, and grouped AUUC for causal evaluation at the scale of 400 million DAU and 30 billion samples (Liu et al., 9 May 2026). A related but more pragmatic system is LiDDA, which combines a transformer-based bottom-up sequence model, a top-down Media-Mix Model, randomized holdout experiments, and inverse propensity weighting to reconcile touchpoint-level credits with macro-level incrementality (Bencina et al., 14 May 2025).
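To make the direct and indirect channel effects concrete, the sketch below simulates a shift intervention in a small linear SCM and compares the resulting average causal effect with the sum of path contributions. Everything here is a simplifying assumption: the two channels, the coefficients, and the function `mean_conversions` are hypothetical, the model is static rather than lagged, and the graph is given rather than discovered by PCMCI.

```python
import numpy as np

# Hypothetical structural coefficients.
DISPLAY_TO_SEARCH = 0.5
DISPLAY_TO_CONV = 0.2   # direct path: display -> conversions
SEARCH_TO_CONV = 0.4    # second leg of: display -> search -> conversions

def mean_conversions(n, delta_display=0.0, seed=0):
    """Mean outcome under do(display := display + delta_display).

    The same seed is reused so that factual and intervened worlds share
    their exogenous noise, as in a counterfactual comparison.
    """
    rng = np.random.default_rng(seed)
    display = rng.normal(10.0, 1.0, n) + delta_display
    search = DISPLAY_TO_SEARCH * display + rng.normal(0.0, 1.0, n)
    conversions = (DISPLAY_TO_CONV * display
                   + SEARCH_TO_CONV * search
                   + rng.normal(0.0, 1.0, n))
    return float(conversions.mean())

# Average causal effect of raising display impressions by one unit.
ace = mean_conversions(10_000, delta_display=1.0) - mean_conversions(10_000)

# Path decomposition in a linear SCM: total = direct + indirect.
direct = DISPLAY_TO_CONV
indirect = DISPLAY_TO_SEARCH * SEARCH_TO_CONV
```

In a linear model the simulated effect coincides with the sum of the two path products; with nonlinear structural equations the decomposition generally depends on the baseline and the size of the shift.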

In model explanation, several lines of work implement CDA with different causal primitives. Black-box interpretation uses potential outcomes and generalized propensity scores to estimate global and local feature effects from input–output observations only (Khademi et al., 2020). Neural-network attribution treats the architecture as an SCM and computes interventional expectations for each feature under do-operations, including temporal extensions to recurrent networks (Chattopadhyay et al., 2019). Visual explanation uses causal graphs over semantic latent factors and counterfactual image generation rather than pixel masking (Parafita et al., 2019). TDCFD distills each original feature into a task-specific causal attribution before applying a downstream predictor (Chu et al., 2023). Causal SHAP uses discovery-driven dependency awareness to suppress importance assigned to merely correlated features (Ng et al., 31 Aug 2025).

Several works push CDA beyond ordinary feature attribution. ProMark proactively embeds concept-specific watermarks into training data and trains latent diffusion models to retain them, so that detecting a watermark in a generated image is used as a causal certificate that the corresponding concept influenced generation (Asnani et al., 2024). CDAL uses counterfactual attention learning for open-world source-model attribution, explicitly separating model-specific generation artifacts from source bias (Zheng et al., 29 Jun 2025).

Climate and sociotechnical systems provide yet another interpretation of CDA. DADA treats meteorological data assimilation outputs as likelihoods of observed trajectories under factual and counterfactual forcings, yielding near-real-time causal attribution in terms of necessary causation (Hannart et al., 2015). Granger-causal climate attribution frames trend and event attribution through observation-based statistical counterfactuals, then argues for hybrid use with Pearl-causal dynamical experiments (Risser et al., 2024). In human–AI collaboration, responsibility attribution is defined over policies and counterfactual reference systems, with avoidability and flagging status determining whether blame lies with the human, the AI, or the flagging mechanism (Qi et al., 2024). Design-time agent specification work extends the idea one step further: 4D-ARE treats attribution as the organizing principle for what an LLM agent should reason about, decomposing organizational explanations into Results, Process, Support, and Long-term dimensions (Yu et al., 8 Jan 2026).

5. Empirical performance and validation

Empirical evaluation is highly domain-specific, but a recurrent pattern is that causal attribution is judged both by attribution quality and by the usefulness of decisions driven by the attribution. In aggregated marketing data, when the true graph is provided, CDA achieves an average relative RMSE of 9.50%, MAPE of 6.12%, and Spearman $\rho$ of 0.89 across 1,000 simulations; with a PCMCI-predicted graph, average relative RMSE is 24.23%, MAPE is 19.80%, and Spearman $\rho$ is 0.53, indicating degradation under structural uncertainty but still meaningful signal recovery (Filippou et al., 24 Dec 2025). In creator-ecosystem optimization, ALM-MTA reports offline upload AUC 0.907, grouped AUUC higher in every propensity bucket with maximum gain +0.070, online DAU +0.04%, daily active creators +0.6%, and unit exposure efficiency +670% (Liu et al., 9 May 2026).

In multi-agent failure attribution, the reported gains are operational rather than merely explanatory. On Who&When and TRAIL, the causal framework reaches up to 36.2% step-level accuracy, 48.5% or 56.8% agent-level accuracy depending on subset, and increases task success rate from 15.4% to 37.8%, an average improvement of 22.4 percentage points; ablations show that removing PCI, Shapley, context conditioning, or counterfactual reasoning degrades performance (Ma et al., 10 Sep 2025). In Causal SHAP, synthetic experiments show the expected suppression of non-causal correlated variables and the lowest reported RMSE on the first synthetic task, while real-world insertion tests yield AUROC 0.8594 on IBS and 0.6271 on colorectal cancer (Ng et al., 31 Aug 2025).

Model-level CDA is frequently validated qualitatively as well as quantitatively. In black-box interpretation, synthetic experiments recover the true impactful variables, MNIST interventions on causal latent variables visibly remove class-defining structure, and causal effects on Parkinson’s disease severity predictions align with known voice markers (Khademi et al., 2020). TDCFD reports 0.97 accuracy, 0.82 precision, and 0.90 recall on synthetic risk prediction, and 0.96 accuracy, 0.86 precision, and 0.80 recall on a real corporate risk dataset (Chu et al., 2023). ProMark reports attribution accuracy that remains strong even as the number of possible concepts increases and under 14 degradations, while adding only a small FID increase relative to a non-watermarked conditional LDM (Asnani et al., 2024).

6. Limitations, misconceptions, and open directions

A central limitation across CDA methods is that causal attribution is only as reliable as its identification assumptions. Black-box feature attribution requires ignorability and positivity (Khademi et al., 2020). Aggregated marketing CDA assumes causal sufficiency, temporal causal Markov, faithfulness, approximate stationarity, and correct lag coverage (Filippou et al., 24 Dec 2025). Causal SHAP depends on causal sufficiency, acyclicity, and linear-Gaussian assumptions for PC and IDA (Ng et al., 31 Aug 2025). Front-door recommendation attribution assumes path containment, no unmeasured mediator–outcome confounding given $T$ and $X$, and adequate positivity in large treatment spaces (Liu et al., 9 May 2026). Multiply-robust change attribution still requires overlap and a correct causal ordering or DAG (Quintas-Martinez et al., 2024).

A common misconception is that any attention weight, Shapley score, or path importance score is already causal. Several papers explicitly reject this. LiDDA states that attention is not causality, even when it is useful for operational attribution (Bencina et al., 14 May 2025). Causal SHAP is motivated by the claim that standard SHAP fails to differentiate between causality and correlation under dependence (Ng et al., 31 Aug 2025). Visual and open-world attribution papers likewise insist that counterfactual or interventional semantics are necessary; otherwise, attribution remains vulnerable to spurious statistical correlations, source bias, or impossible feature combinations (Parafita et al., 2019, Zheng et al., 29 Jun 2025).

Recurring open problems are also shared across domains. Hidden confounding, mediator misspecification, domain shift, sparse logs, long trajectories, and feedback loops are repeatedly identified as failure modes (Ma et al., 10 Sep 2025, Liu et al., 9 May 2026, Filippou et al., 24 Dec 2025). Climate work highlights the limits of instantaneous or purely predictive causality and argues for hybridization of Granger and Pearl perspectives (Risser et al., 2024). Human–AI responsibility work adds a normative complication: blame attribution may need to be evaluated under an adjusted epistemic standard rather than under the raw data-generating distribution (Qi et al., 2024). The literature therefore points toward richer context models, cyclic or dynamic causal models, online causal discovery, sensitivity analysis under graph uncertainty, and tighter integration between attribution and intervention selection as the main directions for further development.