Counterfactual Queries in Causal Inference

Updated 25 June 2026

Counterfactual queries are formal questions that evaluate potential outcomes by intervening on variables within structural causal models, forming a basis for causal inference.
They follow a rigorous three-step process—abduction, action, and prediction—to modify models and compute outcomes based on counterfactual scenarios.
They are applied to estimate individual effects, audit algorithmic fairness, and assess policy impacts through methods like symbolic compilation and neural modeling.

A counterfactual query is a formal query, typically posed within a structural causal model (SCM) or a related framework, that asks “what would have happened to variable(s) Y had variable(s) X been assigned a (possibly contrary-to-fact) value x, possibly given that in reality, evidence E=e was observed?” These queries, at the core of the “third rung” of Pearl’s causal hierarchy, are fundamental to formalizing “what-if” reasoning, auditing model/algorithmic fairness, estimating individual-level causal effects, and understanding system-level policy impacts. Counterfactual queries admit a rigorous semantics via a three-step process: abduction (inferring exogenous variables given evidence), action (intervening on structural equations), and prediction (propagating the intervention to obtain post-interventional outcomes), frequently operationalized through SCMs, logic programs under FCM-semantics, or domain-specific substrate models.

1. Formal Definition and Computation in Structural Causal Models

Counterfactual queries are grounded in the semantics of structural causal models. An SCM is a tuple $M = (U, V, F, P_U)$, where $U$ are exogenous variables capturing system noise or background conditions, $V$ are observed or endogenous variables, and $F = \{ f_1, \ldots, f_n \}$ is a collection of deterministic structural equations $V_i = f_i(\text{Pa}_i, U_i)$. $P_U$ is a joint distribution over the exogenous variables.

A prototypical counterfactual query has the form

$P(Y_{do(X=x)} = y \mid E = e)$

which is read: “What is the probability that $Y$ would have been $y$ had we intervened to set $X = x$, given that we in fact observed $E = e$?” The accepted procedure (Balke et al., 2013) follows:

Abduction: Update the distribution over $U$ to $P(U \mid E = e)$.
Action: Modify $M$ by replacing the structural equations for $X$ with $X = x$ (i.e., apply $do(X = x)$), severing the original dependencies into $X$.
Prediction: Compute the counterfactual value of $Y$ in the modified model, evaluating it for each setting of $U$ as $Y_x(u)$, and averaging with respect to $P(u \mid E = e)$. This yields

$P(Y_{do(X=x)} = y \mid E = e) = \sum_{u} \mathbf{1}[Y_x(u) = y] \, P(u \mid E = e)$
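The three steps above can be sketched by brute-force enumeration on a small discrete SCM. The model here (binary exogenous variables `U_X` and `U_Y`, with `X := U_X` and `Y := X xor U_Y`) and all function names are illustrative assumptions, not taken from the cited work.

```python
from itertools import product

# Hypothetical toy SCM: P(U = 1) for each independent binary exogenous variable.
P_U = {"U_X": 0.6, "U_Y": 0.3}

def solve(u, do=None):
    """Evaluate the structural equations for exogenous setting u.
    `do` maps intervened variables to fixed values (the action step)."""
    do = do or {}
    v = {}
    v["X"] = do.get("X", u["U_X"])            # X := U_X
    v["Y"] = do.get("Y", v["X"] ^ u["U_Y"])   # Y := X xor U_Y
    return v

def prior(u):
    """Probability of one joint exogenous setting under independence."""
    p = 1.0
    for name, val in u.items():
        p *= P_U[name] if val == 1 else 1.0 - P_U[name]
    return p

def counterfactual(query_var, query_val, do, evidence):
    """P(query_var under do = query_val | evidence), by enumerating U."""
    names = sorted(P_U)
    # Abduction: keep exogenous settings whose factual world matches the evidence.
    posterior = {}
    for vals in product([0, 1], repeat=len(names)):
        u = dict(zip(names, vals))
        factual = solve(u)
        if all(factual[k] == v for k, v in evidence.items()):
            posterior[vals] = prior(u)
    z = sum(posterior.values())
    if z == 0:
        raise ValueError("evidence has probability zero under the model")
    # Action and prediction: re-solve the mutilated model for each retained u.
    total = 0.0
    for vals, weight in posterior.items():
        u = dict(zip(names, vals))
        if solve(u, do=do)[query_var] == query_val:
            total += weight / z
    return total

# P(Y would be 1 had X been 0 | Y = 1 was observed), about 0.222 for this toy model
print(counterfactual("Y", 1, do={"X": 0}, evidence={"Y": 1}))
```

Enumeration is exponential in the number of exogenous variables, so this only serves to make the semantics concrete.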

This approach generalizes to nonparametric, cyclic, and stochastic settings, as well as to linear-Gaussian SEMs, where all steps admit closed-form solutions when coefficients and noise distribution are known (Balke et al., 2013).
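As a worked illustration of the closed-form case, consider a two-variable linear SEM with a known coefficient $\beta$; the specific model and symbols are assumptions chosen for the example. With full evidence $(X = x, Y = y)$ the noise term is pinned down exactly, so the counterfactual is a point value:

```latex
\begin{align*}
\text{Model:}\quad & X = U_X, \qquad Y = \beta X + U_Y, \qquad U_X \perp U_Y \\
\text{Abduction:}\quad & u_Y = y - \beta x \\
\text{Action:}\quad & X \leftarrow x' \\
\text{Prediction:}\quad & Y_{x'} = \beta x' + u_Y = y + \beta\,(x' - x)
\end{align*}
```

With only partial evidence, the abduction step instead yields a Gaussian posterior over the unobserved noise terms, and the prediction step propagates that posterior through the modified equations.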

2. Identifiability, Bounds, and Algorithmic Frameworks

Often, the structural equations or the exogenous distribution in an SCM are not uniquely determined by observed data; counterfactuals are then only partially identifiable. In this case, one seeks to bound

$P(Y_{do(X=x)} = y \mid E = e)$

across all models consistent with the observed conditional distributions. This can be formulated as a constrained optimization problem or as linear programs when the counterfactual expression is linear in the “response function” variables (Balke et al., 2013).
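A minimal sketch of such a bounding problem, under assumptions that go beyond the text: $X$ and $Y$ are binary, and the data are the two interventional marginals $a = P(Y_{x=1} = 1)$ and $b = P(Y_{x=0} = 1)$. The four response types of $Y$ then have one free parameter, so the linear program reduces to evaluating a linear objective at the two ends of an interval. The target here is the probability of benefit, $P(Y_{x=1} = 1, Y_{x=0} = 0)$; the function names are hypothetical.

```python
def response_types(a, b, q):
    """Response-type probabilities given marginals a, b and
    the free parameter q = P(Y_{x=1}=1, Y_{x=0}=1)."""
    return {
        "always_1": q,               # Y = 1 regardless of X
        "helped": a - q,             # Y = 1 only under X = 1
        "hurt": b - q,               # Y = 1 only under X = 0
        "never_1": 1.0 - a - b + q,  # Y = 0 regardless of X
    }

def benefit_bounds(a, b):
    """Bounds on P(Y_{x=1}=1, Y_{x=0}=0) over all response-type
    distributions consistent with the two marginals."""
    # Non-negativity of the four probabilities restricts q to an interval.
    q_lo = max(0.0, a + b - 1.0)
    q_hi = min(a, b)
    # The objective a - q is linear and decreasing in q,
    # so the extremes are attained at the interval endpoints.
    return a - q_hi, a - q_lo

lower, upper = benefit_bounds(0.7, 0.4)
print(lower, upper)
```

With more variables or observational constraints, the feasible set is a higher-dimensional polytope and a general LP solver is needed instead of this endpoint evaluation.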

For partially identified queries in semi-Markovian SCMs, symbolic knowledge compilation (e.g., compiling to arithmetic circuits) accelerates the evaluation of many counterfactual queries, enabling iterated EM bounding by reusing circuit structure for different exogenous priors (Huber et al., 2023). These techniques can yield order-of-magnitude speedups and scalable inner approximations to true counterfactual bounds.

Identification of counterfactuals from observed and interventional data is algorithmically solved by the ID* and IDC* algorithms (Tikka, 2022), which decompose queries via counterfactual graphs, C-component partitions, and graphical d-separation, returning an identifying functional in terms of observable or interventional distributions or declaring the query unidentifiable.

3. Extensions: Probabilistic Programming, Logic, and Other Formalisms

Counterfactual queries generalize beyond SCMs to probabilistic logic programming, e.g., ProbLog, where the system of mutually independent Boolean random variables and recursive clauses is interpreted via FCM-semantics (Rückschloß et al., 2023, Kiesel et al., 2023). Interventions are implemented as “surgical” clause replacements, and counterfactuals are answered by a twin-network construction: duplicating the program, intervening in the counterfactual “copy,” and querying with shared external variables.
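The twin-network idea can be illustrated in plain Python rather than ProbLog syntax. The program below (probabilistic facts `u_rain`, `u_spr`, `u_slip` and the clauses shown in the comment) is a hypothetical example, and `run_program` and `twin_query` are illustrative names; the point is only that the factual and intervened copies are evaluated on the same probabilistic facts.

```python
from itertools import product

# Hypothetical program, shown in ProbLog-like notation for orientation:
#   0.3::u_rain.  0.6::u_spr.  0.9::u_slip.
#   rain :- u_rain.          sprinkler :- u_spr.
#   wet :- rain.             wet :- sprinkler.
#   slippery :- wet, u_slip.
FACTS = {"u_rain": 0.3, "u_spr": 0.6, "u_slip": 0.9}

def run_program(u, intervention=None):
    """Evaluate the clauses for one truth assignment u of the facts.
    An intervened atom ignores its clauses and takes the forced value."""
    intervention = intervention or {}
    atoms = {}

    def define(name, value):
        atoms[name] = intervention.get(name, value)

    define("rain", u["u_rain"])
    define("sprinkler", u["u_spr"])
    define("wet", atoms["rain"] or atoms["sprinkler"])
    define("slippery", atoms["wet"] and u["u_slip"])
    return atoms

def twin_query(query, evidence, intervention):
    """P(query holds in the intervened copy | evidence in the factual copy)."""
    names = list(FACTS)
    num = den = 0.0
    for values in product([False, True], repeat=len(names)):
        u = dict(zip(names, values))
        weight = 1.0
        for n in names:
            weight *= FACTS[n] if u[n] else 1.0 - FACTS[n]
        factual = run_program(u)             # original copy
        twin = run_program(u, intervention)  # intervened copy, same facts
        if all(factual[a] == v for a, v in evidence.items()):
            den += weight
            if twin[query]:
                num += weight
    if den == 0:
        raise ValueError("evidence has probability zero")
    return num / den

# Would the road be slippery had the sprinkler been off,
# given that it was on and the road was slippery?
print(twin_query("slippery",
                 evidence={"sprinkler": True, "slippery": True},
                 intervention={"sprinkler": False}))
```

In this toy program the evidence fixes `u_spr` and `u_slip` but leaves `u_rain` at its prior, so the answer equals the prior probability of rain.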

In probabilistic logic programs that are acyclic, proper, and positive in normal form, the underlying program can be reconstructed from observational data alone, and all counterfactual queries are computable by model enumeration or weighted model counting, with complexity exponential in maximum parent set size, but polynomial otherwise (Rückschloß et al., 2023).

4. Empirical Applications and Fairness Auditing

Counterfactual queries also serve in bias measurement: the methodology of Ghai et al. (2020) adapts counterfactual fairness concepts to audit the labeling bias of individual crowd workers. Given queries that contain protected attributes alongside other features, workers are presented with original and counterfactual (attribute-flipped) instances, interleaved and disguised. Worker bias is scored by comparing each worker's labels on the original instances with their labels on the matching counterfactual instances, where higher scores indicate greater sensitivity to protected attributes under minimal-change interventions. This logic generalizes to fairness auditing for algorithms or to any setting where human judgments enter ML pipelines.
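For illustration only, one simple score that matches this description is the fraction of original/counterfactual pairs on which a worker's label changes. This is an assumed stand-in, not necessarily the formula used by Ghai et al.; `worker_bias_score` and the pair format are hypothetical.

```python
def worker_bias_score(label_pairs):
    """Fraction of (original, counterfactual) pairs whose labels differ.

    label_pairs: list of (label_on_original, label_on_counterfactual)
    for a single worker. Assumed scoring rule, for illustration only.
    """
    if not label_pairs:
        raise ValueError("need at least one labeled pair")
    changed = sum(1 for original, flipped in label_pairs if original != flipped)
    return changed / len(label_pairs)

# A worker who changes the label on 2 of 4 attribute-flipped pairs
pairs = [("toxic", "toxic"), ("toxic", "ok"), ("ok", "ok"), ("ok", "toxic")]
print(worker_bias_score(pairs))  # 0.5
```

A score of 0 means the worker's labels never depended on the flipped attribute in the sampled pairs; it does not by itself establish the absence of bias.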

In knowledge-based evaluation of VQA models, counterfactual word-level perturbations informed by WordNet or color ontologies are used to systematically probe model sensitivity and uncover latent biases or robustness gaps (Stoikou et al., 2023).

5. Deep Generative and Neural Approaches to Counterfactual Queries

Recent work leverages invertible flows, diffusion models, and variational graph autoencoders to perform counterfactual estimation in high-dimensional, autoregressive, or time series domains, always via abduction–action–prediction decompositions. Under sufficient model identifiability conditions (e.g., known causal graph, independent noises, invertibility), these architectures guarantee or empirically achieve accurate estimation of counterfactual outcomes for nontrivial queries (Chao et al., 2023, Wu et al., 4 Nov 2025, Sanchez-Martin et al., 2021).
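The role of invertibility can be seen in a one-dimensional sketch. Here a fixed shift-and-scale mechanism stands in for a learned flow layer; the functional form `tanh(x) + exp(0.5 * x) * u` and the function names are assumptions made for the example, whereas a real model would learn the shift and scale from data.

```python
import math

def mechanism(x, u):
    """Assumed structural equation Y := tanh(x) + exp(0.5 * x) * u.
    For fixed x it is strictly increasing in u, hence invertible."""
    return math.tanh(x) + math.exp(0.5 * x) * u

def abduct(x, y):
    """Abduction: invert the mechanism to recover the noise behind (x, y)."""
    return (y - math.tanh(x)) / math.exp(0.5 * x)

def counterfactual_y(x_obs, y_obs, x_cf):
    """Action and prediction: push the recovered noise through the
    mechanism with the parent set to its counterfactual value."""
    u = abduct(x_obs, y_obs)
    return mechanism(x_cf, u)

# Simulate one unit, then ask what Y would have been under x = -0.5
u_true = 0.8
y_obs = mechanism(1.0, u_true)
print(counterfactual_y(1.0, y_obs, -0.5), mechanism(-0.5, u_true))
```

Because the noise is recovered exactly, the estimated and true counterfactuals agree up to floating-point error; with a learned mechanism the agreement depends on how well the model is identified and fit.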

Special attention is given to identifiability in the presence of categorical unobserved variables, where mixture-model–based neural EM approaches yield consistent counterfactual inference under mild clusterability conditions (Brouwer, 2022).

For the extraction and security assessment of linear models, a single (robust) counterfactual query under a differentiable norm suffices to extract the decision boundary, with the complexity scaling linearly in dimension for polyhedral norms (Otto et al., 10 Feb 2026).
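A sketch of why one query can suffice, under simplifying assumptions: the target is a linear classifier `sign(w.x + b)`, and a hypothetical explanation service returns the exact Euclidean-closest point on the decision boundary. The displacement from the query point to that counterfactual is then parallel to `w`, and the counterfactual itself lies on the hyperplane, which fixes the offset. Neither `make_oracle` nor `extract_boundary` is an API from the cited work.

```python
import math

def make_oracle(w, b):
    """Hypothetical counterfactual service for the classifier sign(w.x + b):
    returns the Euclidean projection of x onto the decision boundary."""
    norm_sq = sum(wi * wi for wi in w)

    def closest_counterfactual(x):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        return [xi - score / norm_sq * wi for wi, xi in zip(w, x)]

    return closest_counterfactual

def extract_boundary(x, x_cf):
    """Recover a unit normal and offset of the hyperplane from one query."""
    diff = [c - xi for c, xi in zip(x_cf, x)]
    length = math.sqrt(sum(d * d for d in diff))
    if length == 0:
        raise ValueError("query point already lies on the boundary")
    normal = [d / length for d in diff]
    # The counterfactual lies on the boundary, so normal . x_cf + offset = 0.
    offset = -sum(n * c for n, c in zip(normal, x_cf))
    return normal, offset

secret_w, secret_b = [2.0, -1.0, 0.5], -1.0
oracle = make_oracle(secret_w, secret_b)
x = [3.0, 1.0, 2.0]
normal, offset = extract_boundary(x, oracle(x))
print(normal, offset)
```

The recovered pair describes the same hyperplane as `(secret_w, secret_b)` up to scale and orientation; the orientation follows from the known label of the query point. Services that return a point slightly beyond the boundary, or that use a different norm, would need the more careful analysis given in the cited paper.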

6. Domain-Specific Extensions and Generalizations: Substrate Models and Quantum Causality

Event-graph substrate models represent complex systems as an append-only log of RDF triples, enabling counterfactual queries via causal-ancestor graph traversals and log forking, with exact replay semantics across heterogeneous domains (Rovai, 15 May 2026).
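The cited work's data model is not described here, so the following is only a hypothetical sketch of the two operations named above: traversing causal ancestors and forking an append-only log. The `Event` and `EventLog` classes, the `causes` field, and the example triples are all assumptions; replaying downstream events after a fork would require domain rules that are omitted.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    id: int
    triple: tuple       # (subject, predicate, object)
    causes: tuple = ()  # ids of direct causal parents

class EventLog:
    """Append-only log; events are never mutated or removed."""

    def __init__(self, events=()):
        self.events = list(events)

    def append(self, triple, causes=()):
        event = Event(len(self.events), tuple(triple), tuple(causes))
        self.events.append(event)
        return event.id

    def ancestors(self, event_id):
        """All events reachable by following `causes` links backwards."""
        seen, stack = set(), list(self.events[event_id].causes)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.events[current].causes)
        return seen

    def fork(self, before):
        """New log sharing the prefix before event `before`;
        the original log is left untouched."""
        return EventLog(self.events[:before])

log = EventLog()
e0 = log.append(("pump1", "state", "on"))
e1 = log.append(("valve1", "state", "open"))
e2 = log.append(("tank1", "level", "rising"), causes=(e0, e1))
e3 = log.append(("alarm1", "state", "triggered"), causes=(e2,))

print(log.ancestors(e3))                        # causal ancestors of the alarm
what_if = log.fork(before=e1)                   # branch just before the valve opened
what_if.append(("valve1", "state", "closed"))   # contrary-to-fact event
```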

In quantum causal models, counterfactuals generalize classical do-operator semantics: standard counterfactuals correspond to classical “active” interventions, but quantum formalism distinguishes between “active” and “passive” counterfactuals. Quantum instruments and process operators support situations where counterfactual dependence may occur in the absence of direct causal dependence, e.g., in Bell-inequality–violating scenarios (Suresh et al., 2023).

7. Practical Considerations, Limitations, and Empirical Results

The tractability of counterfactual queries depends on the structural properties of the causal model (acyclicity, bounded treewidth, noise structure) and the availability of interventional data.
Bounding techniques are necessary whenever the causal model is only partially identified from observed or interventional data, with symbolic knowledge compilation and randomized EM providing practical computational advantages (Huber et al., 2023).
For dynamic and latent-state models, counterfactual analysis proceeds by sampling latent state trajectories consistent with observed evidence, then optimizing over possible coupling structures to bound outcomes (Haugh et al., 2022).
In information retrieval and recommender systems, counterfactual query rewriting with historical feedback can leverage relevance signals from document snapshots and outperform both standard and transformer-based systems by precomputing expanded or “key” queries optimized for prior relevance, even as the index evolves (Keller et al., 6 Feb 2025).

In sum, counterfactual queries are now a foundational primitive for causal inference, fairness, robustness, explainability, and system diagnostics, with a converging set of algorithmic, logical, generative-model, and domain-specific approaches yielding both theoretical guarantees and practical advances.