Structural Causal Models: A Primer
- Structural Causal Models (SCMs) are mathematical frameworks using directed graphs and structural equations to define and analyze causal relationships.
- SCMs enable observational, interventional, and counterfactual inference by systematically applying the do-calculus and altering structural equations during interventions.
- Modern extensions of SCMs address latent variables, cycles, and model scalability through techniques like model compression and integration with neural architectures.
A structural causal model (SCM) is a mathematical formalism that represents collections of variables through deterministic or probabilistic structural equations informed by a graph—typically a directed acyclic graph (DAG)—which encodes hypothesized causal relationships. SCMs specify how endogenous (modelled) variables are generated from their causes and exogenous (latent) sources of variability, providing a generative framework for observational, interventional, and counterfactual inference. The paradigm, codified by Pearl (2000), underpins modern causal modeling: it provides a language for describing how interventions propagate through complex systems, for characterizing the limits of causal identifiability from data, and for giving precise semantics to concepts such as confounding, mediation, and counterfactuals (Xia et al., 2021, Bongers et al., 2016).
1. Structural Formulation and Graphical Semantics
An SCM is typically formalized as a tuple $\mathcal{M} = \langle \mathbf{U}, \mathbf{V}, \mathcal{F}, P(\mathbf{U}) \rangle$, where $\mathbf{U}$ is a collection of exogenous variables (sources of background variation), $\mathbf{V}$ is the set of endogenous or observable variables, $\mathcal{F} = \{f_i\}$ is a collection of structural functions determining each $V_i \in \mathbf{V}$ as a function of its parents (other endogenous variables and elements of $\mathbf{U}$), and $P(\mathbf{U})$ is a joint distribution over $\mathbf{U}$. Each structural equation takes the form
$$V_i := f_i(\mathrm{pa}(V_i), U_i),$$
where $\mathrm{pa}(V_i) \subseteq \mathbf{V} \setminus \{V_i\}$ and $U_i \subseteq \mathbf{U}$. The directed edges of the associated graph connect parent variables to children as dictated by $\mathcal{F}$, while unobserved confounders (shared exogenous parents) are represented by bidirected edges in the induced mixed graph. The model supports intervention by replacement: an operation $do(\mathbf{X} = \mathbf{x})$ for some $\mathbf{X} \subseteq \mathbf{V}$ replaces the equations for $\mathbf{X}$ with the constant assignment $\mathbf{X} := \mathbf{x}$, effectively severing incoming edges to $\mathbf{X}$ in the graph.
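As a minimal sketch of this formulation (with hypothetical variable names and mechanisms, not drawn from any cited work), a two-variable SCM and intervention by replacement can be simulated directly:

```python
import random

def sample_mean_y(do_x=None, n=10_000, seed=0):
    """Sample from a toy SCM: X := U_x, Y := 2*X + U_y, with U_x, U_y ~ N(0, 1).

    If do_x is not None, X's structural equation is replaced by the
    constant do_x, severing the edge U_x -> X (intervention by replacement).
    Returns the empirical mean of Y."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        u_x, u_y = rng.gauss(0, 1), rng.gauss(0, 1)
        x = u_x if do_x is None else do_x   # replacement happens here
        y = 2 * x + u_y                     # downstream mechanism unchanged
        total += y
    return total / n

# Observationally E[Y] = 0; under do(X=1) the mean shifts to 2*1 = 2.
obs_mean = sample_mean_y()
int_mean = sample_mean_y(do_x=1.0)
```

The key point the sketch makes concrete is that only the intervened variable's equation changes; all other mechanisms and the exogenous distribution are left intact.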
Syntactically, acyclic SCMs have DAG structure and yield unique solutions for $\mathbf{V}$ given $\mathbf{U}$, but the model class encompasses cyclic and latent-variable (semi-Markovian) cases as well, provided certain solvability conditions hold (Bongers et al., 2016, Xia et al., 2021).
2. Observational, Interventional, and Counterfactual Semantics
SCMs induce well-specified distributions under three regimes:
- Observational: The natural state dictated by $\mathcal{F}$ and $P(\mathbf{U})$, with $\mathbf{V}$ generated recursively (in acyclic models) or by fixed-point solution (in cyclic models).
- Interventional: Applying $do(\mathbf{X} = \mathbf{x})$ replaces the equations for $\mathbf{X}$ and alters the downstream joint law of $\mathbf{V}$. The interventional distribution $P(\mathbf{V} \mid do(\mathbf{X} = \mathbf{x}))$ quantifies the effect of hypothetical manipulations.
- Counterfactual: For a factual observation $\mathbf{V} = \mathbf{v}$, counterfactual inference addresses what would have been under alternative interventions (e.g., $Y_{x'}$ given $\mathbf{V} = \mathbf{v}$). Technically, this is defined via a “twin network” in which the same exogenous $\mathbf{U}$ is reused for both the factual and counterfactual structural assignments (Bongers et al., 2016).
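The counterfactual regime can be made concrete with the standard abduction–action–prediction recipe on a toy linear SCM (hypothetical mechanisms chosen so abduction is exact and deterministic):

```python
# Toy SCM: X := U_x, Y := 2*X + U_y. Because the mechanisms are
# invertible, the exogenous terms are fully recoverable from a
# factual observation, so the counterfactual is a point value.
def counterfactual_y(x_obs, y_obs, x_cf):
    # Abduction: recover exogenous values consistent with the facts.
    u_y = y_obs - 2 * x_obs
    # Action: replace X's equation with the constant x_cf.
    x = x_cf
    # Prediction: propagate through the unchanged mechanism for Y,
    # reusing the same u_y — the "twin network" construction.
    return 2 * x + u_y

# Factually X=1, Y=2.5 (so u_y = 0.5); had X been 3,
# Y would have been 2*3 + 0.5 = 6.5.
y_cf = counterfactual_y(1.0, 2.5, 3.0)
```

In non-invertible or stochastic settings, abduction yields a posterior over $\mathbf{U}$ rather than a point value, and the prediction step produces a counterfactual distribution instead.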
The do-calculus provides algebraic rules for relating observational and interventional distributions, underpinning identifiability theory (Xia et al., 2021).
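For reference, Pearl's three rules of the do-calculus can be stated as follows, writing $G_{\overline{X}}$ for the graph with edges into $X$ removed and $G_{\underline{X}}$ for the graph with edges out of $X$ removed:

```latex
\begin{align}
\text{(Rule 1, insertion/deletion of observations):}\quad
& P(y \mid do(x), z, w) = P(y \mid do(x), w)
  && \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}} \\
\text{(Rule 2, action/observation exchange):}\quad
& P(y \mid do(x), do(z), w) = P(y \mid do(x), z, w)
  && \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}\,\underline{Z}} \\
\text{(Rule 3, insertion/deletion of actions):}\quad
& P(y \mid do(x), do(z), w) = P(y \mid do(x), w)
  && \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}\,\overline{Z(W)}}
\end{align}
```

where $Z(W)$ denotes the set of $Z$-nodes that are not ancestors of any $W$-node in $G_{\overline{X}}$. A causal effect is identifiable precisely when repeated application of these rules reduces the interventional query to a purely observational expression.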
3. Identifiability, Causal Discovery, and Expressivity
Identifiability in SCMs addresses whether causal effects (e.g., $P(Y \mid do(X = x))$) can be derived from the observed joint distribution and known graph structure. The causal hierarchy theorem rigorously separates what is in principle estimable from data (level-1: associational, level-2: interventional, and level-3: counterfactual), regardless of the mechanism class (Xia et al., 2021). Expressivity results show that SCMs parameterized with neural networks are universally expressive, but identifiability is bottlenecked by available data and the causal graph itself, not by functional capacity. Even infinitely expressive NNs cannot recover interventional quantities from observational data alone if the graph is ambiguous. Thus, inductive bias and explicit encoding of structural constraints are necessary for nontrivial causal inference and generalization.
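The observational/interventional gap can be exhibited numerically. Below is a minimal illustration (with hypothetical parameter choices) of two linear-Gaussian SCMs — one where X causes Y, one where a confounder explains everything — whose observational second moments coincide exactly but whose interventional predictions differ:

```python
# Model A (X causes Y):       X := U_x,       Y := 0.5*X + E,
#   with Var(U_x) = 1, Var(E) = 0.75.
# Model B (pure confounding): X := C + U_x,   Y := C + E,
#   with Var(C) = 0.5, Var(U_x) = 0.5, Var(E) = 0.5.
# Both are zero-mean Gaussian, so second moments fix the distribution.

def moments_a():
    var_x = 1.0
    cov_xy = 0.5 * var_x            # slope times Var(X)
    var_y = 0.25 * var_x + 0.75     # 0.5^2 * Var(X) + Var(E)
    return var_x, cov_xy, var_y

def moments_b():
    var_c, var_ux, var_e = 0.5, 0.5, 0.5
    var_x = var_c + var_ux
    cov_xy = var_c                  # only the shared confounder C links X and Y
    var_y = var_c + var_e
    return var_x, cov_xy, var_y

def mean_y_do_x(model, x):
    # E[Y | do(X = x)]: in A the effect is 0.5*x; in B, cutting the
    # edges into X leaves Y = C + E untouched, so the mean stays 0.
    return 0.5 * x if model == "A" else 0.0
```

Since both models induce the same observational distribution, no amount of observational data can distinguish them; only the graph (or interventional data) resolves the ambiguity.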
In the context of causal discovery, SCMs are foundational: they undergird principled constraint-based and score-based structure-learning algorithms, and synthetic data generated from SCMs is used for benchmarking, e.g., with internally-standardized SCMs that avoid trivial “depth artifacts” (Ormaniec et al., 2024).
4. Generalizations: Cycles, Latents, and Non-Classical Extensions
SCMs extend beyond acyclic, fully observed graphs. Cyclic SCMs permit deterministic or probabilistic feedback mechanisms, provided the system is (possibly subset-wise) uniquely solvable (Bongers et al., 2016). Marginalization over latent endogenous variables preserves SCM semantics under unique solvability, and the induced mixed graph represents latent confounding via bidirected edges. Extensions to cyclic and simple SCMs guarantee the existence and uniqueness of all observational/interventional/counterfactual distributions under mild technical conditions.
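Unique solvability of a cyclic SCM can be illustrated with a two-variable linear feedback loop (hypothetical coefficients chosen so the loop is a contraction): the fixed-point iteration of the structural assignments converges to the same solution as solving the linear system directly.

```python
# Cyclic linear SCM:  X := 0.5*Y + u1,   Y := 0.4*X + u2.
# Uniquely solvable because the loop gain satisfies |0.5 * 0.4| < 1.

def solve_fixed_point(u1, u2, iters=200):
    """Iterate the structural assignments until they stabilize."""
    x = y = 0.0
    for _ in range(iters):
        x, y = 0.5 * y + u1, 0.4 * x + u2   # simultaneous update
    return x, y

def solve_closed_form(u1, u2):
    """Solve (I - A) v = u for the linear system directly."""
    det = 1.0 - 0.5 * 0.4
    x = (u1 + 0.5 * u2) / det
    y = (u2 + 0.4 * u1) / det
    return x, y

x_fp, y_fp = solve_fixed_point(1.0, 2.0)
x_cf, y_cf = solve_closed_form(1.0, 2.0)
```

For loop gains of modulus at least 1, the iteration diverges and the SCM has no (or no unique) solution, which is exactly the failure of solvability that the cited conditions rule out.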
More recent work emphasizes the limitations of classical SCMs in representing steady-state behavior of deterministic dynamical systems (ODE equilibria), or “functional laws” with nontrivial activation under intervention (e.g., the ideal gas law). Such cases demand further generalizations, as embodied by the Causal Constraints Model (CCM) formalism (Blom et al., 2018), which equips each constraint with an explicit activation set for interventions and supports non-algebraic stationary constraints.
5. Algorithmic Approaches and Model Compression
The multiplicity and complexity of variables in large-scale SCMs have motivated model compression strategies. Consolidation operations, which merge (compose) mechanisms to form higher-level, black-box “aspect variables” (CCVs), yield consolidated SCMs that strictly generalize marginalization. Crucially, consolidation preserves the full set of interventional distributions on the remaining variables—unlike naive marginalization—while dramatically reducing computational overhead and facilitating interpretation (Willig et al., 2023). Algorithms for consolidation construct compositional mappings from original exogenous variables and interventions to high-level observable variables, with guarantees of minimal sufficient statistics for target queries.
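A schematic sketch of the idea (not the algorithm of Willig et al., 2023; mechanisms are hypothetical) is to compose the mechanisms along a chain X → Z → Y into a single black-box mechanism for Y, pooling the exogenous inputs of the merged mechanisms:

```python
# Original chain:  Z := f_z(X, u_z),   Y := f_y(Z, u_y).
def f_z(x, u_z):
    return 2 * x + u_z

def f_y(z, u_y):
    return z * z + u_y

# Consolidated mechanism: Z is hidden inside the composition, and
# (u_z, u_y) is pooled into one exogenous input for the new mechanism.
def consolidated_f_y(x, u):
    u_z, u_y = u
    return f_y(f_z(x, u_z), u_y)

# Interventions on the retained variable X are preserved: evaluating
# the original chain under do(X=3) agrees with the consolidated model.
u = (0.1, -0.2)
original = f_y(f_z(3.0, u[0]), u[1])
compressed = consolidated_f_y(3.0, u)
```

What is given up is the ability to intervene on the consolidated-away variable Z; consolidation guarantees agreement only on interventional distributions over the variables that remain, which is precisely what distinguishes it from naive marginalization over functionless latents.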
Abstraction between SCMs at different levels of granularity is formalized via interventional-consistency maps between variable sets and exogenous distributions, providing a critical link for causal representation learning, modularity, and applications requiring mechanistic fidelity, projection, or compositionality (Zennaro, 2022).
6. SCMs in Causal Representation Learning and Machine Learning
Recent innovations connect SCMs to neural architectures and causal representation learning paradigms. Graph neural networks (GNNs), as universal function approximators on graph-structured data, can express any SCM through suitable parameterization and retain precise identification-theoretic semantics: what is identifiable from a GNN-based iVGAE is equivalent to what is identifiable from the SCM's graph and observed distribution (Zečević et al., 2021). However, the separation between observational and interventional levels is fundamental—no training protocol on observational data alone can break the do-calculus barrier imposed by the SCM's structure.
Moreover, SCMs provide the mathematical backbone for individualized causal effect estimation (ICE), operationalizing personalized inference by conditioning on individual-specific variables via abduction (“indiv-operator”) and thereafter propagating hypothetical interventions (Chang, 17 Jun 2025). This aligns with precision medicine and counterfactual image generation in medical contexts (Reinhold et al., 2021).
7. Applications, Limitations, and Outlook
SCMs supply a unified framework for causal reasoning in economics, biology, medicine, and beyond. They accommodate time-series and dynamic systems via dynamic SCMs with robust identifiability criteria (Ferreira et al., 2023), and have been adapted for extremes, tail-based dependence, and rare events, e.g., in hydrology and finance (Jiang et al., 12 May 2025, Engelke et al., 9 Mar 2025).
Limitations include non-identifiability in the absence of sufficient interventions or domain knowledge, the inability to capture steady-state multiplicity or invariant functional relationships without extension (requiring CCMs), and challenges in high-dimensional model selection, particularly in tree-structured or latent-confounded graphs (Gupta et al., 2023, Bjøru et al., 17 Nov 2025).
SCMs remain a central conceptual and algorithmic framework for modern causal inference, with continuous advancement in expressivity, identifiability theory, model compression, and their integration with flexible, data-driven function classes (Xia et al., 2021, Ormaniec et al., 2024, Willig et al., 2023, Zečević et al., 2021).