Structural Causal Models
- Structural Causal Models are formal frameworks that represent cause-effect mechanisms via structural equations and directed acyclic graphs.
- They underpin causal inference by enabling observational, interventional, and counterfactual queries essential for decision-making.
- Recent extensions incorporate cyclic relations, dynamic systems, and extreme value theory to address complex real-world challenges.
A structural causal model (SCM) is a formal framework for representing cause–effect mechanisms through systems of structural equations on a set of endogenous (modeled) and exogenous (noise or background) variables. Each variable is a deterministic or stochastic function of its parents, with the causal structure encoded in a directed acyclic graph (DAG). SCMs underlie modern approaches to causal inference, causal discovery, counterfactual reasoning, and interventional prediction, and they serve as the statistical backbone for disciplines ranging from genetics to climate science and autonomous systems. SCM theory has expanded to encompass cyclic relations, latent confounders, time-dependent dynamics, the analysis of benchmarking artifacts, and new developments in the modeling of extremes. Below, key dimensions of the field are treated in depth.
1. Mathematical Formalism and Structural Equations
The classical nonparametric Structural Causal Model is specified by a quadruple $\mathcal{M} = \langle \mathbf{V}, \mathbf{U}, \mathbf{F}, P_{\mathbf{U}} \rangle$, where:
- $\mathbf{V} = \{X_1, \dots, X_n\}$ are endogenous variables,
- $\mathbf{U} = \{U_1, \dots, U_n\}$ are exogenous (noise) variables,
- $\mathbf{F} = \{f_1, \dots, f_n\}$ are structural assignments $X_i := f_i(\mathrm{pa}(X_i), U_i)$, with parent set $\mathrm{pa}(X_i) \subseteq \mathbf{V} \setminus \{X_i\}$,
- $P_{\mathbf{U}}$ is a joint law (often a product measure) over the exogenous noise.
This induces a directed graph $G(\mathcal{M})$ in which each edge $X_j \to X_i$, for $X_j \in \mathrm{pa}(X_i)$, encodes a direct causal influence. Observational distributions are obtained by solving the system and marginalizing over the noise. The do-operator (intervention semantics) modifies $\mathcal{M}$ by replacing the assignment $f_i$ with the constant $X_i := x_i$ for each intervened variable $X_i$; interventional and counterfactual distributions are generated by the corresponding surgeries and abduction–action–prediction routines (see (Bongers et al., 2016, Zečević et al., 2021)).
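To make the formalism concrete, the following minimal sketch (the variable names and linear mechanisms are assumptions made for illustration, not taken from the cited works) samples an observational distribution from a two-variable acyclic SCM and applies a do-intervention by replacing a structural assignment with a constant.

```python
# Minimal sketch of an acyclic SCM with interventions (illustrative only;
# the linear/additive mechanisms and coefficients are assumptions).
import numpy as np

rng = np.random.default_rng(0)

def sample_scm(n, do=None):
    """Draw n samples from a toy SCM X -> Y; `do` fixes variables by name."""
    do = do or {}
    u_x = rng.normal(size=n)              # exogenous noise for X
    u_y = rng.normal(size=n)              # exogenous noise for Y
    x = do.get("X", 1.0 + u_x)            # structural assignment X := 1 + U_X
    y = do.get("Y", 2.0 * x + u_y)        # structural assignment Y := 2X + U_Y
    return {"X": np.broadcast_to(x, (n,)), "Y": np.broadcast_to(y, (n,))}

obs = sample_scm(10_000)                  # observational distribution P(X, Y)
intv = sample_scm(10_000, do={"X": 3.0})  # distribution under do(X = 3)
print(obs["Y"].mean(), intv["Y"].mean())  # ~2.0 vs ~6.0
```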
Acyclic SCMs yield unique solutions and well-behaved distributions, supporting do–calculus (Pearl) and sound graphical Markov properties. Cyclic SCMs require unique solvability conditions for each strongly connected component to maintain consistency, uniqueness, and graphical faithfulness (see (Bongers et al., 2016)).
2. Causal Inference: Observational, Interventional, and Counterfactual Distributions
SCMs anchor causal inference through:
- Observational queries: via the joint distribution induced by structural assignments and the noise law.
- Interventional queries: via surgery on the structural equations, fixing variables to chosen values and removing their incoming edges (Pearl's do–calculus).
- Counterfactual queries: via the twin–network construction, where factual and counterfactual worlds share exogenous inputs but impose interventions in the counterfactual copy; inference proceeds by abduction (conditioning on data), action, and prediction.
Interventional distributions play a foundational role in identifiability, mediation analysis, and process optimization (Galanti et al., 2020), while counterfactuals underpin explanations and responsibility analysis (Zaffalon et al., 2020, Zečević et al., 2021).
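The abduction–action–prediction recipe can be illustrated on a toy additive-noise SCM (the mechanism and coefficients below are assumed for illustration): the exogenous noise consistent with a factual observation is inferred, the intervention is performed in a twin copy that shares this noise, and the downstream variable is recomputed.

```python
# Hedged sketch of abduction-action-prediction for a counterfactual query in an
# additive-noise SCM (the linear mechanism and coefficients are illustrative
# assumptions, not taken from any cited paper).

# Structural assignments: X := U_X,  Y := 2*X + U_Y  (exogenous U_X, U_Y).
def f_Y(x, u_y):
    return 2.0 * x + u_y

# Factual observation for one unit.
x_obs, y_obs = 1.0, 2.5

# 1) Abduction: infer the exogenous noise consistent with the evidence.
u_y_hat = y_obs - 2.0 * x_obs            # U_Y = Y - 2X under the assumed model

# 2) Action: perform do(X = 3) in a "twin" copy of the model that shares the
#    abducted noise with the factual world.
x_cf = 3.0

# 3) Prediction: recompute the downstream variable with the shared noise.
y_cf = f_Y(x_cf, u_y_hat)
print(y_cf)                               # 2*3 + 0.5 = 6.5
```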
3. Causal Graphs, Markov Properties, and Identifiability
The structural equations induce a causal (directed mixed) graph whose topology determines allowable independence constraints via d-separation (in DAGs) or σ-separation in more general graphs.
For acyclic SCMs, the directed global Markov property holds: d-separation in the induced graph implies conditional independence in the observational distribution (Bongers et al., 2016). In the presence of cycles, unique solvability on each component is required for σ-separation to encode Markov properties.
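A quick simulation (assuming a linear-Gaussian chain, chosen purely for illustration) shows the directed global Markov property at work: the d-separation of X and Z given Y translates into a vanishing partial correlation.

```python
# Simulation check for an assumed linear-Gaussian chain X -> Y -> Z:
# d-separation of X and Z given Y shows up as a near-zero partial correlation.
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)
z = 1.5 * y + rng.normal(size=n)

def partial_corr(a, b, c):
    """Correlation of a and b after regressing out a single conditioning variable c."""
    ra = a - np.polyval(np.polyfit(c, a, 1), c)
    rb = b - np.polyval(np.polyfit(c, b, 1), c)
    return np.corrcoef(ra, rb)[0, 1]

print(np.corrcoef(x, z)[0, 1])   # marginally dependent (clearly nonzero)
print(partial_corr(x, z, y))     # ~0: X independent of Z given Y, as d-separation predicts
```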
Identifiability of causal structure from data is a central problem. Under linear-Gaussian assumptions with equal noise variances and acyclicity, structure can in principle be identified beyond the Markov equivalence class, but apparent success on simulated data often rests on benchmarking artifacts such as var-sortability and R²-sortability (Ormaniec et al., 17 Jun 2024). Non-Gaussian noise, as in LiNGAM-type independent-noise models, further aids identifiability. Internally-standardized SCMs (iSCMs) resolve the underlying signal-to-noise drift, restoring classical identifiability up to Markov equivalence (Ormaniec et al., 17 Jun 2024).
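The sketch below illustrates the var-sortability phenomenon on a simulated linear SCM. The diagnostic computed here (the fraction of causally ordered node pairs whose marginal variances increase along the order) is an assumed simplification for illustration and does not reproduce the benchmark analyses of the cited work.

```python
# Rough illustration of the var-sortability diagnostic on a simulated linear
# SCM (the exact definition used here, i.e. the fraction of causally ordered
# node pairs whose marginal variances increase along the order, is an
# assumption made for illustration).
import itertools
import numpy as np

rng = np.random.default_rng(2)
n, d = 20_000, 4
# Upper-triangular weight matrix => acyclic SCM with causal order 0, 1, 2, 3.
W = np.triu(rng.uniform(0.5, 2.0, size=(d, d)), k=1)
X = np.zeros((n, d))
for j in range(d):
    X[:, j] = X @ W[:, j] + rng.normal(size=n)   # X_j := sum_i W_ij X_i + U_j

variances = X.var(axis=0)
pairs = list(itertools.combinations(range(d), 2))          # i precedes j causally
var_sortability = np.mean([variances[i] < variances[j] for i, j in pairs])
print(var_sortability)   # close to 1.0: raw variances leak the causal order
```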
Recent developments for extremes, such as extremal SCMs (eSCMs), exploit exponent measures and asymptotic causal asymmetry, achieving full orientation of the DAG under natural conditions—beyond Markov equivalence—through angular support of the exponent measure (Fang et al., 1 Aug 2025).
4. Extensions: Time-Series, Dynamical Systems, and Extreme Values
Time-Series and Dynamical Systems
SCMs have been generalized to dynamic settings:
- Structural Dynamical Causal Models (SDCMs) (Bongers et al., 2018) allow for time-indexed stochastic processes and random differential equations, encoding both temporal evolution and intervention effects.
- Dynamic Structural Causal Models (DSCMs) (Rubenstein et al., 2016) recover equilibrium and asymptotic behavior of ODEs under time-dependent interventions, providing a framework for dynamic causal queries beyond static equilibrium analysis (a minimal equilibrium sketch follows this list).
- Macro-variable SCMs for coarse-grained time-series (especially in the frequency domain) facilitate causal analysis at the aggregate level, preserving intervention semantics (Janzing et al., 2018).
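As a minimal illustration of equilibrium-level intervention semantics for dynamical systems (the two-variable linear ODE and its coefficients are invented for this sketch and are not taken from the cited frameworks), the system below is integrated to equilibrium once freely and once with one variable clamped, mimicking a do-intervention on the stationary state.

```python
# Hedged sketch: equilibrium of an assumed two-variable linear ODE, with and
# without clamping x (an intervention on the stationary state).

def simulate(clamp_x=None, steps=20_000, dt=1e-3):
    x, y = 1.0, 1.0
    for _ in range(steps):
        if clamp_x is not None:
            x = clamp_x                   # intervention: hold x fixed
        else:
            x += dt * (1.0 - x)           # dx/dt = 1 - x   (x relaxes to 1)
        y += dt * (2.0 * x - y)           # dy/dt = 2x - y  (y tracks 2x)
    return x, y

print(simulate())                         # free equilibrium ~ (1.0, 2.0)
print(simulate(clamp_x=3.0))              # under do(x = 3): equilibrium ~ (3.0, 6.0)
```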
Structural Causal Models for Extremes
Recent work has equipped SCMs with the apparatus of extreme-value theory. Under regular variation of the tails, limit models described by multivariate Pareto distributions (with associated exponent measures) arise. Tail regularity of the structural assignments guarantees a well-defined limiting graphical structure (the “extremal graph”), which may shed edges whose influence vanishes in the tails (Engelke et al., 9 Mar 2025). Extremal conditional independence and extremal graphical models further enable structure learning with tests adapted to extremes, avoiding the pitfalls of applying classical methods to tail data (Fang et al., 1 Aug 2025).
5. Learning, Estimation, and Integration of Domain Knowledge
SCMs can be estimated from data using constraint-based (PC, FCI) or score-based methods, sometimes with the explicit encoding of prior knowledge. The Causal Knowledge Hierarchy (CKH) methodology (Adib et al., 2022) proposes hierarchical weighting of expert judgments, data-driven structure, and literature evidence, integrating these sources via convex weights and maximizing a confidence-weighted orientation objective. Sensitivity analysis demonstrates robustness to mis-specified priors.
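A hypothetical sketch of convex-weight evidence fusion is given below; the matrices, weights, and threshold are invented for illustration and do not reproduce the CKH procedure of Adib et al. (2022).

```python
# Hypothetical illustration of combining edge-orientation evidence from several
# sources with convex weights, loosely in the spirit of CKH. All values below
# are invented for illustration.
import numpy as np

vars_ = ["A", "B", "C"]
# Confidence that row -> column is a causal edge, per evidence source.
expert     = np.array([[0, 0.9, 0.1], [0, 0, 0.2], [0, 0, 0]])
data_based = np.array([[0, 0.6, 0.4], [0, 0, 0.7], [0, 0, 0]])
literature = np.array([[0, 0.8, 0.0], [0, 0, 0.5], [0, 0, 0]])

weights = np.array([0.5, 0.3, 0.2])       # convex: nonnegative, summing to 1
combined = weights[0] * expert + weights[1] * data_based + weights[2] * literature

threshold = 0.5                            # accept an orientation above this score
for i in range(len(vars_)):
    for j in range(len(vars_)):
        if combined[i, j] >= threshold:
            print(f"{vars_[i]} -> {vars_[j]} (score {combined[i, j]:.2f})")
```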
Bayesian inference schemes, including latent-variable SCMs (Subramanian et al., 2022), leverage variational approximations, permutation-invariant priors, and emission models (often neural network-based) for joint inference over causal variables, structure, and parameters, supporting robust out-of-distribution prediction.
6. Limitations, Generalizations, and Practical Modeling Issues
Modeling Functional Laws and Equilibrium Behavior
Standard SCMs prove inadequate for certain equilibrium dynamical systems, notably when initial conditions persistently influence the stationary distribution (e.g., in enzyme kinetics). Causal Constraints Models (CCMs) generalize SCMs by encoding arbitrary collections of intervention-specific constraints, including conservation laws and functional equations (Blom et al., 2018).
Cycles, Latent Confounders, and Marginalization
Cyclic SCMs present challenges—nonuniqueness, failures of standard conditional-independence properties, and potential pathologies in marginalization. The simple-SCM subclass, characterized by unique solvability on every subset, preserves most acyclic SCM properties and supports algorithms for structure learning, intervention, and counterfactual analysis (Bongers et al., 2016).
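For linear cyclic SCMs, unique solvability takes a particularly transparent form, as the sketch below (with assumed coefficients) shows: writing the system as X = BX + U, a unique solution exists whenever I - B is invertible, and a fixed-point iteration converges to it when the spectral radius of B is below one.

```python
# Minimal sketch of unique solvability in a cyclic linear SCM: X = B X + U has
# the unique solution X = (I - B)^{-1} U when I - B is invertible; a spectral
# radius of B below 1 also guarantees convergence of the fixed-point iteration.
# Coefficients are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
B = np.array([[0.0, 0.4],      # X1 := 0.4*X2 + U1
              [0.3, 0.0]])     # X2 := 0.3*X1 + U2   (a 2-cycle)
u = rng.normal(size=2)

x_closed = np.linalg.solve(np.eye(2) - B, u)   # unique solution of the cycle

# Fixed-point iteration converges to the same solution because rho(B) < 1.
x = np.zeros(2)
for _ in range(200):
    x = B @ x + u
print(np.allclose(x, x_closed))                # True
```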
Computational Complexity and Approximate Inference
Causal inference in SCMs is NP-hard even for polytree graphs with latent variables (Zaffalon et al., 2020). Approximate inference approaches, including EM algorithms, credal network mapping, and LP relaxations, provide practical solutions for counterfactual reasoning, bounding effects when identifiability fails (Zaffalon et al., 2020, Zaffalon et al., 2020).
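As a simple, generic illustration of effect bounding when point identification fails (this is the classical Tian–Pearl bound on the probability of necessity and sufficiency, not the credal-network machinery of the cited works), interventional quantities alone already confine a counterfactual probability to an interval.

```python
# Illustration of bounding an unidentified counterfactual: the Tian-Pearl
# bounds on the probability of necessity and sufficiency (PNS) computed from
# interventional quantities alone. This is a generic textbook bound, offered
# here only as an example of partial identification.
def pns_bounds(p_y_do_x1, p_y_do_x0):
    """Bounds on P(Y_{x=1}=1, Y_{x=0}=0) given P(y|do(X=1)) and P(y|do(X=0))."""
    lower = max(0.0, p_y_do_x1 - p_y_do_x0)
    upper = min(p_y_do_x1, 1.0 - p_y_do_x0)
    return lower, upper

print(pns_bounds(0.7, 0.3))   # (0.4, 0.7): the PNS is only partially identified
```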
7. Applications, Impact, and Ongoing Research Directions
SCMs underpin structural discovery in fields as diverse as neuroimaging (SCMs for MR images (Reinhold et al., 2021)), geometric deep learning for shape models (structural causal mesh generation (Rasal et al., 2022)), hydrological extremes, climate-impact modeling, and autonomous vehicle system design (Howard et al., 3 Jun 2024). They enable both model-based and data-driven causal learning, provide a grounding for explanation systems (SCE (Zečević et al., 2021)), and connect to Markov processes for principled counterfactual inference in complex biomolecular systems (Ness et al., 2019).
Open questions involve inference under selection bias (conditioning operations (Chen et al., 12 Jan 2024)), handling dynamic populations, scalable algorithms for high-dimensional extremes, and integration of SCMs with modular, learning-based components (Howard et al., 3 Jun 2024).
For further details on specific technical results, refer to (Bongers et al., 2016, Fang et al., 1 Aug 2025, Engelke et al., 9 Mar 2025, Zaffalon et al., 2020, Zaffalon et al., 2020, Ormaniec et al., 17 Jun 2024, Adib et al., 2022), and (Blom et al., 2018). These works collectively articulate the mathematical foundations, extensions, and practical computations central to modern SCM research.