Counterfactual Probability Spaces

Updated 5 January 2026
  • Counterfactual probability spaces are mathematical frameworks that formalize joint probability measures over multiple worlds, integrating both observed and counterfactual scenarios using product spaces.
  • They enable the identification and testability of counterfactual queries by employing causal graphs, synchronization of latent variables, and decomposition into C-components.
  • This framework generalizes structural causal models and potential-outcomes approaches, offering flexible and rigorous methods for causal inference and policy evaluation.

A counterfactual probability space is a mathematical structure underpinning the representation, computation, and testability of counterfactual statements within causal inference. Rooted in structural causal models (SCMs) but generalizing beyond them, such spaces formalize the joint law of random variables across multiple, typically hypothetical, "worlds," capturing both observed reality and specific interventions or alternative scenarios. The concept clarifies which counterfactual queries can be identified from experiments and elucidates the stochastic dependencies between diverse possible worlds.

1. Mathematical Foundations and Definitions

Let $W$ denote an index set of "worlds," often comprised of a factual world (observed) and one or more counterfactual (hypothetical/intervened) worlds. For each $i \in W$, let $(\Omega_i, \mathcal{F}_i)$ be a measurable space describing outcomes in world $i$. The product measurable space is

$$(\Omega, \mathcal{F}) = \left(\prod_{i\in W} \Omega_i, \; \bigotimes_{i\in W} \mathcal{F}_i\right).$$

A counterfactual probability space is the triple $(\Omega, \mathcal{F}, P)$, where $P$ is a probability measure on this product space (Park et al., 1 Jan 2026). Any point $\omega \in \Omega$ can be represented as the tuple $\omega = (\omega_i)_{i\in W}$, with each component detailing a realization in world $i$.

The probability measure $P$ governs joint (cross-world) events and encodes both the observable statistics and the stochastic dependencies, or lack thereof, between the worlds. The mathematical structure generalizes constructs such as potential-outcomes models, canonical representations of SCMs, and couplings from optimal transport-based counterfactuals, but is not constrained by causal diagrams, interventions, or acyclicity assumptions (Park et al., 1 Jan 2026, Lara, 22 Jul 2025).
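As a minimal sketch of the definition, the following builds a finite counterfactual probability space for two worlds that each record one coin toss. The world names, the outcome labels, and the numerical values of the joint measure are illustrative assumptions, not taken from the cited papers; the helper names `marginal` and `prob` are likewise hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical two-world setup: each world records one coin toss.
worlds = ("factual", "counterfactual")
omega_i = {"factual": ("H", "T"), "counterfactual": ("H", "T")}

# Product sample space: each point is a tuple (omega_i) indexed by the worlds.
omega = list(product(*(omega_i[w] for w in worlds)))

# An illustrative joint measure P on the product space (values are assumed).
P = {
    ("H", "H"): Fraction(2, 5),
    ("H", "T"): Fraction(1, 10),
    ("T", "H"): Fraction(1, 10),
    ("T", "T"): Fraction(2, 5),
}
assert set(P) == set(omega) and sum(P.values()) == 1


def marginal(P, world_index):
    """Push P forward onto the coordinate of a single world."""
    out = {}
    for point, mass in P.items():
        out[point[world_index]] = out.get(point[world_index], 0) + mass
    return out


def prob(P, event):
    """Probability of a cross-world event given as a predicate on tuples."""
    return sum(mass for point, mass in P.items() if event(point))


factual_marginal = marginal(P, 0)          # law of the factual world alone
agree = prob(P, lambda w: w[0] == w[1])    # a genuinely cross-world event
```

The marginals describe each world on its own, while an event such as "both worlds show the same face" can only be evaluated with the joint measure.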

2. Counterfactual Spaces: Shared Information and Independence

Central to counterfactual probability spaces is the treatment of "shared information" between worlds. This is formalized through the measure-theoretic relationships between the $\sigma$-algebras $\mathcal{F}^F$ and $\mathcal{F}^{CF}$ generated by the factual and counterfactual worlds, respectively:

  • World-independence: $\mathcal{F}^F \perp_P \mathcal{F}^{CF}$ if $P(A\cap B) = P(A)P(B)$ for all $A\in\mathcal{F}^F$, $B\in\mathcal{F}^{CF}$. The worlds are stochastically independent.
  • Synchronization: $\mathcal{F}^F \#_P \mathcal{F}^{CF}$ if for every $A\in\mathcal{F}^F$ there exists $B\in\mathcal{F}^{CF}$ so that $P(A\Delta B)=0$, i.e., knowledge of one world's outcomes determines the other.

Most counterfactual modeling in causal inference assumes some degree of synchronization, often via shared latent randomness (e.g., exogenous variables in SCMs), but the axiomatic counterfactual probability space allows for arbitrary intermediate relationships, including conditional independence or partial synchronization (Park et al., 1 Jan 2026). This flexibility strictly generalizes SCM-based semantics.
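On a finite two-world space, both relations can be checked by brute force. The sketch below assumes that events in each world's σ-algebra are sets of the form "the world's coordinate lies in a subset S" and enumerates all such subsets. The synchronization check follows the one-directional wording given above. The three example measures and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import chain, combinations, product


def subsets(values):
    """All subsets of a finite outcome set, as frozensets."""
    values = list(values)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(values, r) for r in range(len(values) + 1))]


def prob(P, event):
    return sum(mass for point, mass in P.items() if event(point))


def world_independent(P, vals_f, vals_cf):
    """Check P(A and B) == P(A) P(B) for every factual A and counterfactual B."""
    for S, T in product(subsets(vals_f), subsets(vals_cf)):
        p_a = prob(P, lambda w: w[0] in S)
        p_b = prob(P, lambda w: w[1] in T)
        p_ab = prob(P, lambda w: w[0] in S and w[1] in T)
        if p_ab != p_a * p_b:
            return False
    return True


def synchronized(P, vals_f, vals_cf):
    """Check that every factual event A has a counterfactual B with P(A xor B) == 0."""
    for S in subsets(vals_f):
        if not any(prob(P, lambda w: (w[0] in S) != (w[1] in T)) == 0
                   for T in subsets(vals_cf)):
            return False
    return True


coin = ("H", "T")
quarter, half = Fraction(1, 4), Fraction(1, 2)
independent_coins = {pt: quarter for pt in product(coin, coin)}
copied_coin = {("H", "H"): half, ("T", "T"): half}
partial = {("H", "H"): Fraction(2, 5), ("H", "T"): Fraction(1, 10),
           ("T", "H"): Fraction(1, 10), ("T", "T"): Fraction(2, 5)}

results = {name: (world_independent(P, coin, coin), synchronized(P, coin, coin))
           for name, P in [("independent", independent_coins),
                           ("copied", copied_coin),
                           ("partial", partial)]}
```

The third measure satisfies neither condition, which illustrates the intermediate regime of partial synchronization that the axiomatic framework permits.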

3. Embedding Causal Models and Testability Criteria

In causal inference, the classical SCM is a tuple $M = \langle U, V, F, P(U) \rangle$ with exogenous variables $U$, endogenous variables $V$, functions $F=\{f_v\}$, and exogenous distribution $P(U)$. Counterfactuals such as $P(y)$, with $y$ a conjunction of counterfactual events $Y_{x_i}=y_i$, are defined by evaluating parallel copies of the SCM, one per intervention, all sharing the same $U$ (Shpitser et al., 2012, Park et al., 1 Jan 2026).
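The parallel-worlds evaluation can be sketched for a toy SCM with two binary variables. The structural equations (`X := U_X`, `Y := X xor U_Y`) and the noise probabilities are assumptions chosen only for illustration. The point is that each world is solved with the same draw of $U$, so the joint law of $(Y_{x=0}, Y_{x=1})$ is obtained by summing $P(U)$ over the noise values.

```python
from fractions import Fraction
from itertools import product

# Assumed exogenous distribution P(U) with U = (u_x, u_y), components independent.
P_U = {
    (ux, uy): px * py
    for (ux, px), (uy, py) in product(
        [(0, Fraction(1, 2)), (1, Fraction(1, 2))],
        [(0, Fraction(7, 10)), (1, Fraction(3, 10))],
    )
}


def f_x(u):
    return u[0]            # X := U_X


def f_y(x, u):
    return x ^ u[1]        # Y := X xor U_Y


def solve(u, do_x=None):
    """Solve the SCM for one noise value, optionally under do(X = do_x)."""
    x = f_x(u) if do_x is None else do_x
    return {"X": x, "Y": f_y(x, u)}


def counterfactual_joint(P_U, interventions):
    """Joint law of (Y_x for x in interventions), all worlds sharing the same u."""
    joint = {}
    for u, mass in P_U.items():
        key = tuple(solve(u, do_x=x)["Y"] for x in interventions)
        joint[key] = joint.get(key, 0) + mass
    return joint


joint = counterfactual_joint(P_U, interventions=(0, 1))
```

In this toy model the two potential outcomes are perfectly anti-correlated, because the shared noise fully determines both worlds.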

Testability hinges on identifiability from experiments:

  • A counterfactual query $\varphi$ is said to be identifiable (testable) if it is uniquely determined by the family of interventional distributions $P^* = \{P(V\mid do(x))\}$. Formally, $\varphi$ is identifiable if $P_{M_1}(\varphi) = P_{M_2}(\varphi)$ for all SCMs $M_1, M_2$ that share the same causal graph and satisfy $P^*_1 = P^*_2$.
  • The central result is a graphical criterion: a counterfactual $P(y)$ is identifiable from $P^*$ under a given graph $G$ if and only if in the counterfactual graph $G_y$ there is no C-component $S$ containing a variable forced to conflicting values that affects some node in $S$ (Shpitser et al., 2012).

These criteria are operationalized using graph algorithms (make-cg, ID*, IDC*) that factor and assess the counterfactual graph structure, merging nodes as necessary and recursively analyzing identifiability (Shpitser et al., 2012).
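A standard way to see why identifiability can fail is to exhibit two SCMs over the graph X → Y that agree on every experiment on X yet disagree on a cross-world query. The sketch below uses two assumed toy mechanisms for Y (one ignores X, the other is X xor noise) and compares only the interventional distributions P(Y | do(x)); it illustrates the definition and is not an implementation of make-cg, ID*, or IDC*.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
P_U = {0: HALF, 1: HALF}   # assumed noise distribution for Y, shared by both models


def f_y_m1(x, u):
    return u               # model M1: Y ignores X


def f_y_m2(x, u):
    return x ^ u           # model M2: Y := X xor U


def interventional(f_y, x):
    """Distribution of Y under do(X = x)."""
    dist = {}
    for u, mass in P_U.items():
        y = f_y(x, u)
        dist[y] = dist.get(y, 0) + mass
    return dist


def cross_world(f_y, y0, y1):
    """P(Y_{x=0} = y0, Y_{x=1} = y1), evaluating both worlds with the same u."""
    return sum(mass for u, mass in P_U.items()
               if f_y(0, u) == y0 and f_y(1, u) == y1)


same_experiments = all(
    interventional(f_y_m1, x) == interventional(f_y_m2, x) for x in (0, 1)
)
q_m1 = cross_world(f_y_m1, 0, 1)
q_m2 = cross_world(f_y_m2, 0, 1)
```

Both models produce a fair coin for Y under either intervention, but they assign different probabilities to the event that Y would be 0 under x = 0 and 1 under x = 1, so that query is not determined by these experiments.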

4. Canonical and Alternative Constructions

Canonical Representations of SCMs

A canonical (or process-based) representation translates an SCM into a stochastic process over the product space of potential outcomes or worlds. Each node $i$ and parent configuration admits a joint measure $S^{(i)}$ with fixed marginals, and the global counterfactual law is constructed by gluing these measures across the graph (Lara, 22 Jul 2025). Crucially:

  • The observational and interventional distributions (marginals) are preserved, while cross-world dependencies (counterfactual layer) can be freely specified subject to marginal constraints.
  • This approach separates empirical constraints from subjective or metaphysical counterfactual assumptions, allowing for transparent manipulation and comparison of counterfactual conceptions (Lara, 22 Jul 2025).
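The freedom left in the cross-world layer can be made concrete for a single binary outcome under two interventions. Assuming the marginals p0 = P(Y_0 = 1) and p1 = P(Y_1 = 1) are fixed, every joint law is determined by one number t = P(Y_0 = 1, Y_1 = 1), which may range over the Fréchet-Hoeffding interval. The function names and the numerical marginals below are illustrative assumptions, not the construction of the cited paper.

```python
from fractions import Fraction


def coupling_range(p0, p1):
    """Admissible values of t = P(Y_0 = 1, Y_1 = 1) for fixed marginals."""
    return max(Fraction(0), p0 + p1 - 1), min(p0, p1)


def coupling(p0, p1, t):
    """Joint law of (Y_0, Y_1) with marginals p0, p1 and overlap t."""
    lo, hi = coupling_range(p0, p1)
    if not lo <= t <= hi:
        raise ValueError("t lies outside the Frechet-Hoeffding range")
    return {(1, 1): t, (1, 0): p0 - t, (0, 1): p1 - t, (0, 0): 1 - p0 - p1 + t}


def marginals(joint):
    """Recover (P(Y_0 = 1), P(Y_1 = 1)) from a joint law."""
    m0 = sum(mass for (y0, _), mass in joint.items() if y0 == 1)
    m1 = sum(mass for (_, y1), mass in joint.items() if y1 == 1)
    return m0, m1


p0, p1 = Fraction(3, 5), Fraction(1, 2)      # assumed interventional marginals
lo, hi = coupling_range(p0, p1)
independent = coupling(p0, p1, p0 * p1)      # worlds treated as independent
comonotone = coupling(p0, p1, hi)            # maximal agreement between worlds
```

Both joint laws reproduce the same marginals, so no experiment on a single world distinguishes them; the choice between them is exactly the counterfactual assumption.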

Generalized Frameworks

The most abstract framework dispenses with interventions entirely, positing that any product-space probability measure suffices for defining counterfactual events (Park et al., 1 Jan 2026). SCMs and potential-outcomes models are special cases embedded within this larger universe. Under mild regularity, all such models map into "symmetric counterfactual probability spaces" by synchronizing latent variables or exogenous randomness.

Table: Representative Model-to-Counterfactual-Space Mappings

| Model Class | Probability Space Structure | Synchronization Mechanism |
|---|---|---|
| SCM (Pearl) | $(U, V, F, P(U))$; product over $U$ | Deterministic or stochastic $U$ shared |
| Potential-outcomes | $(\Omega, \mathcal{F}, P)$, with $Y(a)$ | Potential outcomes indexed by treatment |
| Transport-based models | $(\mathcal{X}_s \times \mathcal{X}_{s'}, \pi_{\langle s' \mid s\rangle})$ | Couplings (optimal or otherwise) |
| General counterfactual | $(\prod_i \Omega_i, \bigotimes_i \mathcal{F}_i, P)$ | Arbitrary; may be independent or synchronized |

5. Algorithms and Testable Counterfactuals

Algorithms for counterfactual identification formalize when and how counterfactual probabilities can be computed from experimental data. For example, in acyclic SCMs with known graph structure, the ID* algorithm recursively expresses an unconditional counterfactual $P(y)$ as a sum over products of interventional distributions corresponding to the C-components of the counterfactual graph (Shpitser et al., 2012). The IDC* algorithm extends this to conditional counterfactuals $P(y \mid e)$, handling evidence that may itself involve cross-world variables.

Summary Algorithmic Steps (ID*):

  1. Check trivial or impossible events.
  2. Merge duplicate variables across worlds based on functional equivalence.
  3. Decompose into C-components and express probability as a product of interventional marginals when possible.
  4. If conflicts persist (e.g., variables forced to conflicting assignments in the same C-component), identifiability fails.

Soundness and completeness theorems guarantee that the algorithm yields correct expressions when testability holds, and signals failure otherwise (Shpitser et al., 2012).
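The decomposition step alone can be sketched as a connected-components computation. The code below assumes that shared latent noise is encoded as a list of bidirected edges and groups nodes that are linked through such edges; the node names and the edge encoding are hypothetical, and the other steps of ID* (merging, recursion, conflict detection) are not implemented.

```python
def c_components(nodes, bidirected):
    """Partition nodes into C-components: classes connected via bidirected edges."""
    adjacency = {n: set() for n in nodes}
    for a, b in bidirected:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:                       # depth-first search over bidirected edges
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(adjacency[v] - component)
        seen |= component
        components.append(frozenset(component))
    return components


# Hypothetical counterfactual graph: Y under x=0 and Y under x=1 share the
# same exogenous noise, encoded as one bidirected edge; W has its own noise.
nodes = ["W", "Y_x0", "Y_x1"]
bidirected = [("Y_x0", "Y_x1")]
components = c_components(nodes, bidirected)
```

Here the two copies of Y fall into one C-component, which is the situation in which conflicting assignments of x can block identification.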

6. Connections to Other Frameworks and Generalizations

Counterfactual probability spaces unify and generalize disparate approaches:

  • Potential-outcome spaces as in the Neyman-Rubin causal model are a special case with discrete worlds indexed by treatment (Kawakami et al., 13 Nov 2025).
  • Transport-based models define counterfactuals as couplings between observable conditional distributions, optimizing under a similarity or cost function; under certain conditions, optimal transport couplings reproduce SCM-induced counterfactual maps (Lara et al., 2021).
  • Continuous-time martingale frameworks define counterfactual probability measures via solutions to stochastic differential equations, extending the product-space logic to integrated path spaces and stochastic calculus (Røysland, 2011).
  • Random utility and economic models represent counterfactual choice distributions as solutions to linear inequalities over a mixture simplex, again leveraging the product-space structure for counterfactual queries and bounds (Kitamura et al., 2019).

Moreover, the axiomatic probability space approach is broad enough to encompass models without explicit interventions or latent variables, providing a foundation for both classical and novel counterfactual analyses (Park et al., 1 Jan 2026).
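For the transport-based view, a one-dimensional sketch is enough to show what a coupling between a factual and a counterfactual sample looks like. It assumes equally sized samples and a squared-distance cost, for which matching by rank (the monotone coupling) is optimal; the sample values are made up, and the brute-force search is included only as a sanity check. This is not the method of the cited transport papers.

```python
from itertools import permutations


def monotone_coupling(factual, counterfactual):
    """Pair samples by rank: smallest with smallest, and so on."""
    if len(factual) != len(counterfactual):
        raise ValueError("this sketch assumes equally sized samples")
    return list(zip(sorted(factual), sorted(counterfactual)))


def squared_cost(pairs):
    """Total squared-distance transport cost of a matching."""
    return sum((a - b) ** 2 for a, b in pairs)


factual = [2.0, 1.0, 3.0]            # assumed factual observations
counterfactual = [10.0, 30.0, 20.0]  # assumed counterfactual-group observations

pairs = monotone_coupling(factual, counterfactual)

# Sanity check by exhaustive search over all matchings (feasible only for tiny samples).
best_brute_force = min(
    squared_cost(zip(factual, perm)) for perm in permutations(counterfactual)
)
```

Each pair can be read as "the unit with factual value a would have had value b", a coupling defined from observable distributions alone, without latent variables.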

7. Illustrative Examples and Implications

Examples highlight the range and flexibility of counterfactual probability spaces:

  • Independent coins: Product space with independent factual and counterfactual coin tosses; zero shared information (Park et al., 1 Jan 2026).
  • SCM-synchronized outcomes: Identical noise variables $u$ synchronize factual and counterfactual outcomes, matching SCM semantics.
  • Non-causal transport-based couplings: Optimal-transport mappings coupling empirical distributions to define feasible counterfactuals without recourse to unobserved variables (Lara et al., 2021).
  • Choice under random utility: Patch-based probabilistic modeling constrains plausible counterfactual demand solely by consistency with observed choice data, using linear programs over mixture weights (Kitamura et al., 2019).

A central outcome is that counterfactual identification and testability require not simply the structural postulation of "parallel worlds" but also explicit control and representation of their stochastic coupling, subject to empirical or logical constraints. Counterfactual probability spaces provide the minimal, yet expressive, infrastructure for such analysis, subsuming both SCM-based and non-SCM-based approaches (Shpitser et al., 2012, Lara, 22 Jul 2025, Park et al., 1 Jan 2026).
