Summary Causal Graphs (SCGs)
- Summary Causal Graphs (SCGs) are compact DAGs that abstract intricate system details to highlight essential causal relationships among high-level variables.
- They support causal inference by employing d-separation and do-calculus to determine valid adjustment sets and estimate causal effects from observational data.
- SCGs streamline model construction in fields like structural engineering by systematically integrating domain expertise into a practical graphical framework.
A Summary Causal Graph (SCG) is a compact graphical abstraction designed to encode the essential causal structure among high-level variables, often by suppressing fine-grained mechanistic details. In the formal framework established for structural engineering and generalized causal modeling, an SCG is a directed acyclic graph (DAG) whose nodes represent the principal random variables of interest and edges denote direct causal effects as inferred from domain knowledge. The central purpose of an SCG is to enable the identification and estimation of causal effects, particularly by supporting rigorous procedures grounded in the local Markov property, d-separation, and do-calculus. By providing such a reduced representation, SCGs facilitate both graphical reasoning and practical model construction, especially in domains where experimental design is constrained or data is limited (Naser, 2023).
1. Definition and Mathematical Formalism
A Summary Causal Graph is formally defined as a DAG $G = (V, E)$, where each node $V_i \in V$ denotes a random variable, and each directed edge $V_i \to V_j$ encodes a presumed direct causal influence of $V_i$ on $V_j$. The notion of "summary" in this context means that only high-level relationships critical for causal inference are retained, while intricate system details are abstracted away. For any SCG, the joint probability distribution factorizes according to the Markov property:

$$P(V_1, \dots, V_n) = \prod_{i=1}^{n} P\big(V_i \mid \mathrm{Pa}(V_i)\big)$$

where $\mathrm{Pa}(V_i)$ denotes the set of parent nodes of $V_i$ in $G$ (Naser, 2023).
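As a minimal sketch of the factorization, the snippet below builds the joint distribution of a small graph from per-node conditional tables. The three-node graph, the variable names, and every probability value are made up for illustration and do not come from the source.

```python
from itertools import product

# Hypothetical three-node SCG: Z -> X, Z -> Y, X -> Y (all variables binary).
parents = {"Z": [], "X": ["Z"], "Y": ["Z", "X"]}

# Made-up tables: cpt[node][parent_values] = P(node = 1 | parents).
cpt = {
    "Z": {(): 0.3},
    "X": {(0,): 0.2, (1,): 0.7},
    "Y": {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.5, (1, 1): 0.8},
}

def joint(assignment):
    """P(V_1, ..., V_n) computed as the product of P(V_i | Pa(V_i))."""
    prob = 1.0
    for node, pa in parents.items():
        p_one = cpt[node][tuple(assignment[p] for p in pa)]
        prob *= p_one if assignment[node] == 1 else 1.0 - p_one
    return prob

nodes = list(parents)
total = sum(
    joint(dict(zip(nodes, values)))
    for values in product([0, 1], repeat=len(nodes))
)
print(round(total, 10))  # the factorized joint sums to 1 over all assignments
```

Only one table per node is needed, indexed by that node's parents, which is what makes the graph a compact description of the full joint.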
2. d-Separation and Identification of Causal Effects
d-Separation serves as the principal criterion for inferring conditional independencies from the SCG structure. Given any path between two nodes $X$ and $Y$, a set of nodes $Z$ is said to block the path if either of the following conditions holds:
- The path contains a chain or fork ($A \to B \to C$ or $A \leftarrow B \to C$) and $B \in Z$.
- The path contains a collider ($A \to B \leftarrow C$) but $B \notin Z$ and no descendant of $B$ is in $Z$.
If all paths between $X$ and $Y$ are blocked by $Z$ according to these rules, $X$ is d-separated from $Y$ given $Z$, which implies the conditional independence $X \perp Y \mid Z$ in any distribution compatible with $G$ (Naser, 2023). This graphical criterion enables the determination of which variables must be controlled for (i.e., included in an adjustment set) to consistently estimate a causal effect.
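The two blocking rules can be checked mechanically. The sketch below enumerates every path between two nodes and applies the rules to each consecutive triple; it is a direct, unoptimized reading of the definition (path enumeration is exponential in the worst case), and the example graph with nodes `W`, `X`, `Y`, `C` is hypothetical.

```python
def descendants(edges, node):
    """All nodes reachable from `node` along directed edges."""
    children = {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def simple_paths(edges, start, goal):
    """All simple paths between start and goal, ignoring edge direction."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    def walk(path):
        if path[-1] == goal:
            yield path
            return
        for nxt in neighbours.get(path[-1], ()):
            if nxt not in path:
                yield from walk(path + [nxt])

    yield from walk([start])

def path_blocked(edges, path, z):
    """True if the conditioning set z blocks this particular path."""
    edge_set = set(edges)
    for a, b, c in zip(path, path[1:], path[2:]):
        if (a, b) in edge_set and (c, b) in edge_set:
            # Collider: blocks unless b or one of its descendants is in z.
            if b not in z and not (descendants(edges, b) & z):
                return True
        elif b in z:
            # Chain or fork: blocks when the middle node is in z.
            return True
    return False

def d_separated(edges, x, y, z):
    """True if every path between x and y is blocked by z."""
    z = set(z)
    return all(path_blocked(edges, p, z) for p in simple_paths(edges, x, y))

# Hypothetical graph: W confounds X and Y, and C is a common effect of both.
edges = [("W", "X"), ("W", "Y"), ("X", "C"), ("Y", "C")]
print(d_separated(edges, "X", "Y", set()))       # fork through W is open
print(d_separated(edges, "X", "Y", {"W"}))       # W blocks the fork, C stays closed
print(d_separated(edges, "X", "Y", {"W", "C"}))  # conditioning on C opens the collider
```

The last call illustrates why conditioning on a collider is an error: adding `C` to the set turns a valid separating set into an invalid one.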
3. Construction Workflow for SCGs
The development of an SCG from domain expertise is systematic and aligned with the established causal modeling methodology:
- Specify the causal query: Clearly define the effect to be identified (e.g., the impact of beam depth on deflection).
- List candidate variables: Enumerate all exposures, outcomes, suspected confounders, mediators, colliders, moderators, and instrumental variables.
- Allocate causal arrows: For each pair $(A, B)$, include $A \to B$ if intervening on $A$ would affect $B$ under current scientific understanding.
- Label structural motifs: Explicitly identify confounders ($W$ causing both $X$ and $Y$), mediators ($M$ on $X \to M \to Y$ pathways), and colliders ($C$ with $X \to C \leftarrow Y$).
- Prune and summarize: Remove nodes extraneous to exposure-outcome association unless they block critical paths; merge less influential variables to maintain graph compactness.
- Verify d-separation and adjustment sets: Apply the back-door criterion to find sets $Z$ such that no element of $Z$ is a descendant of the exposure $X$ and $Z$ blocks every back-door path from $X$ to the outcome $Y$ (every path that begins with an arrow into $X$). Front-door or instrumental variable strategies may be invoked if back-door adjustment fails (Naser, 2023).
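The final verification step in the workflow above can be sketched in code. The check below uses a standard equivalent formulation of the back-door criterion: the candidate set must contain no descendant of the exposure, and it must d-separate exposure and outcome once all edges leaving the exposure are removed. d-Separation is tested here with the moralized ancestral graph method. The beam graph and its node names are illustrative assumptions, not taken from the source.

```python
from itertools import combinations

def ancestors_of(edges, nodes):
    """The given nodes together with all of their ancestors."""
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(edges, x, y, z):
    """d-Separation test via the moralized ancestral graph."""
    z = set(z)
    keep = ancestors_of(edges, {x, y} | z)
    adj = {n: set() for n in keep}
    parents = {}
    for a, b in edges:
        if a in keep and b in keep:
            adj[a].add(b)
            adj[b].add(a)
            parents.setdefault(b, set()).add(a)
    # Moralize: connect every pair of parents that share a child.
    for pa in parents.values():
        for p, q in combinations(pa, 2):
            adj[p].add(q)
            adj[q].add(p)
    # Delete z, then test whether y is still reachable from x.
    seen, stack = {x}, [x]
    while stack:
        for n in adj[stack.pop()] - z:
            if n not in seen:
                seen.add(n)
                stack.append(n)
    return y not in seen

def satisfies_backdoor(edges, x, y, z):
    """True if z meets the back-door criterion relative to (x, y)."""
    z = set(z)
    all_nodes = {v for edge in edges for v in edge}
    descendants_of_x = {
        n for n in all_nodes if n != x and x in ancestors_of(edges, {n})
    }
    if z & descendants_of_x:
        return False
    # Remove edges out of x so that only back-door paths remain.
    without_out_edges = [(a, b) for a, b in edges if a != x]
    return d_separated(without_out_edges, x, y, z)

# Illustrative graph (an assumption): span drives both the chosen depth and
# the deflection; depth acts on deflection through stiffness.
edges = [
    ("Span", "Depth"), ("Span", "Deflection"),
    ("Depth", "Stiffness"), ("Stiffness", "Deflection"),
]
print(satisfies_backdoor(edges, "Depth", "Deflection", {"Span"}))
print(satisfies_backdoor(edges, "Depth", "Deflection", set()))
print(satisfies_backdoor(edges, "Depth", "Deflection", {"Stiffness"}))
```

In this sketch `{"Span"}` passes, the empty set fails because the path through `Span` stays open, and `{"Stiffness"}` fails because a mediator is a descendant of the exposure.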
4. Functional Identification via do-Calculus
Pearl's do-calculus provides a complete set of three transformation rules for converting interventional probabilities into functions of observational probabilities. For an SCG, each rule is licensed by a d-separation condition checked in a mutilated graph (one with selected edges removed), and together the rules establish whether a causal estimand is identifiable from observational data. Classic cases include:
- Back-door adjustment: $P(Y \mid do(X)) = \sum_{z} P(Y \mid X, Z = z)\, P(Z = z)$ when a valid adjustment set $Z$ exists.
- Front-door adjustment and instrumental variable strategies: Invoked when back-door adjustment fails or unmeasured confounders are present. These approaches can, under specific structures, still express $P(Y \mid do(X))$ via observable quantities (Naser, 2023).
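For reference, the front-door case also has a standard closed form. Assume a mediator set $M$ (a symbol introduced here for illustration) that intercepts every directed path from $X$ to $Y$, has no unblocked back-door path from $X$, and has all of its back-door paths to $Y$ blocked by $X$. The effect is then identified by the textbook front-door formula:

$$P(y \mid do(x)) = \sum_{m} P(m \mid x) \sum_{x'} P(y \mid m, x')\, P(x')$$

The inner sum is a back-door adjustment for the effect of $M$ on $Y$ with $X$ as the adjustment variable, and the outer sum propagates the effect of $X$ on $M$, which is unconfounded under the stated assumptions.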
5. Example: Flood-Zone Confounding in Civil Engineering
An illustrative domain application involves the analysis of structural failures in metropolitan flood zones. Variables include:
- $Z$: strict zoning code (potential confounder),
- $X$: number of structures (exposure),
- $Y$: number of failures (outcome).
Domain knowledge specifies the SCG: $Z \to X$, $Z \to Y$, $X \to Y$. The adjustment procedure identifies $Z$ as a confounder; the back-door path $X \leftarrow Z \to Y$ is blocked by conditioning on $Z$. Accordingly, the interventional effect is computed as:

$$P(Y \mid do(X = x)) = \sum_{z} P(Y \mid X = x, Z = z)\, P(Z = z)$$

(Naser, 2023)
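A small numerical sketch shows how adjusting for the zoning code changes the estimate. All counts below are invented for illustration, and the three variables are coded as binary (strict zoning or not, many structures or few, failure observed or not), which is a simplification assumed here rather than something stated in the source.

```python
# Made-up counts for illustration only; keys are (z, x, y) with binary coding:
# z = 1 strict zoning code, x = 1 many structures, y = 1 failure observed.
counts = {
    (0, 0, 0): 70,  (0, 0, 1): 30,
    (0, 1, 0): 150, (0, 1, 1): 150,
    (1, 0, 0): 270, (1, 0, 1): 30,
    (1, 1, 0): 70,  (1, 1, 1): 30,
}
total = sum(counts.values())

def p_y_given_xz(x, z):
    """P(Y = 1 | X = x, Z = z) estimated from the counts."""
    return counts[(z, x, 1)] / (counts[(z, x, 0)] + counts[(z, x, 1)])

def p_z(z):
    """Marginal P(Z = z)."""
    return sum(n for (zz, _, _), n in counts.items() if zz == z) / total

def p_y_do_x(x):
    """Back-door adjustment: sum over z of P(Y = 1 | X = x, Z = z) P(Z = z)."""
    return sum(p_y_given_xz(x, z) * p_z(z) for z in (0, 1))

def p_y_given_x(x):
    """Unadjusted P(Y = 1 | X = x), shown for comparison."""
    num = sum(n for (_, xx, yy), n in counts.items() if xx == x and yy == 1)
    den = sum(n for (_, xx, _), n in counts.items() if xx == x)
    return num / den

adjusted = p_y_do_x(1) - p_y_do_x(0)
naive = p_y_given_x(1) - p_y_given_x(0)
print(f"adjusted: {adjusted:.2f}, naive: {naive:.2f}")  # adjusted: 0.20, naive: 0.30
```

With these invented numbers the unadjusted contrast overstates the effect, because zones without a strict code have both more structures and a higher failure rate.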
6. Interpretive and Practical Guidance
SCGs graphically clarify the types of variables and causal relations to avoid errors such as conditioning on colliders. They highlight available identification strategies for causal effects, making the selection of adjustment sets transparent. In fields such as civil engineering, where randomized experiments are impractical, SCGs encode the minimal scientific assumptions required to derive robust, causally meaningful design guidelines directly from observational data.
7. Role of SCGs in Causal Modeling and Research Acceleration
In structural engineering and other applied sciences, SCGs enable practitioners to move beyond correlation-based inference, so that statistical associations estimated in regression models are given a causal interpretation only when the graph supports it. By formalizing qualitative scientific expertise into a DAG structure, SCGs facilitate the systematic identification and estimation of causal parameters, thereby expediting both research and implementation of best practices (Naser, 2023).
An SCG, in essence, distills qualitative causal knowledge into a quantitatively actionable graphical object that fixes the factorization of the joint distribution, supports d-separation reasoning, and enables the full machinery of causal inference via do-calculus, as articulated in the foundational work on causal diagrams for structural engineering (Naser, 2023).