- The paper introduces a novel framework leveraging hard interventions and generalized do-constraints to accurately uncover causal structures with latent confounders.
- It establishes augmented pair graphs and twin augmented MAGs as key tools for determining I-Markov equivalence under hard interventions.
- The proposed FCI-inspired algorithm efficiently learns graph structure through conditional independence tests within domains and equality tests across interventional data.
The paper "Characterization and Learning of Causal Graphs from Hard Interventions" (2505.01037) addresses the fundamental problem of discovering the underlying causal structure of a system using experimental data, specifically focusing on hard interventions. Unlike soft interventions, which modify causal mechanisms, hard interventions fix a variable's value, effectively breaking its incoming causal links. The authors argue that hard interventions can be more informative for causal discovery, particularly in the presence of latent variables (unobserved confounders), compared to soft interventions.
The core contribution lies in developing a framework to leverage multiple hard interventional datasets, potentially combined with observational data, to learn causal graphs with latent variables. Prior work on interventional causal discovery often focused on settings with causal sufficiency (no latents) or soft interventions, leaving the comprehensive use of hard interventions in the latent variable setting underexplored.
The paper's approach is built upon a generalization of Pearl's do-calculus [pearl1995causal]. While standard do-calculus provides rules to infer properties of interventional distributions from a causal graph, the authors propose using the converse of these rules to infer graphical structure from observed invariances (equalities) or differences (inequalities) across different distributions (observational and various hard interventional). This leads to a new set of "do-constraints" that link distributional properties to specific graphical features like open backdoor paths.
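For reference, Pearl's three rules in their standard form (the paper works with generalized versions and their converses; the precise statements are in the paper itself):

```latex
% Rule 1 (insertion/deletion of observations):
P(y \mid \mathrm{do}(x), z, w) = P(y \mid \mathrm{do}(x), w)
  \quad \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}}
% Rule 2 (action/observation exchange):
P(y \mid \mathrm{do}(x), \mathrm{do}(z), w) = P(y \mid \mathrm{do}(x), z, w)
  \quad \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}\,\underline{Z}}
% Rule 3 (insertion/deletion of actions):
P(y \mid \mathrm{do}(x), \mathrm{do}(z), w) = P(y \mid \mathrm{do}(x), w)
  \quad \text{if } (Y \perp Z \mid X, W) \text{ in } G_{\overline{X}\,\overline{Z(W)}}
% where Z(W) are the Z-nodes that are not ancestors of any W-node in G_{\overline{X}}.
```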
To represent the causal graphs and test these new constraints, the authors introduce augmented graph structures:
- Augmented Pair Graph: For any pair of intervention targets I and J, an augmented graph is constructed by taking two copies of the observable variables (V(I) and V(J)), adding the edges of the original graph under do(I) and do(J) respectively (i.e., removing edges into I and into J), and introducing an auxiliary F-node connected to the variables in the symmetric difference I Δ J (a simplified construction is sketched after this list). This structure allows m-separation statements in the augmented graph to represent the distributional invariances tested by the generalized do-calculus rules (Proposition 1).
- Twin Augmented MAG: Derived from the Maximal Ancestral Graph (MAG) of the Augmented Pair Graph, the Twin Augmented MAG symmetrizes the adjacencies between the F-node and the variable copies (V(I),V(J)). This novel graphical object is proven to provide a necessary and sufficient condition for two causal graphs (with latents) to be I-Markov equivalent under a given set of hard interventions (Theorem 1). Two graphs are I-Markov equivalent if they entail the same set of hard interventional distributions that satisfy the proposed I-Markov property (Definition 1, based on generalized do-calculus).
- I-augmented MAG Tuple: To provide a more compact and unified representation of the equivalence class across all interventions, the authors propose the I-augmented MAG tuple. This tuple contains one graph (AugI(D,I)) for each intervention target I in the set I. Each AugI(D,I) focuses on the graph structure under do(I) and includes F-nodes connecting to other domains J∈I∖{I}. The I-Markov equivalence of two graphs is shown to be equivalent to the pairwise graphical equivalence of their corresponding I-augmented MAG tuples (Proposition 2). This tuple serves as the target structure for the learning algorithm.
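The sketch below makes the augmented pair graph of the first item above concrete. It is a simplified reading of the construction as summarized here (for example, it attaches the F-node to both domain copies and models latent confounding as a separate set of bidirected pairs); consult the paper for the exact definition. networkx is used only for convenience:

```python
import networkx as nx

def augmented_pair_graph(directed, bidirected, I, J):
    """Simplified augmented pair graph for hard interventions do(I) and do(J).

    directed   : iterable of (parent, child) edges over observed variables
    bidirected : iterable of (a, b) pairs standing in for latent confounders
    I, J       : sets of hard-intervention targets
    """
    G = nx.DiGraph()   # directed part, over both domain copies
    bidir = set()      # bidirected (confounded) pairs, kept separately

    for tag, targets in (("I", I), ("J", J)):
        for u, v in directed:
            if v not in targets:                       # do(.) deletes edges into targets
                G.add_edge(f"{u}^{tag}", f"{v}^{tag}")
        for a, b in bidirected:
            if a not in targets and b not in targets:  # confounding into targets is cut too
                bidir.add(frozenset({f"{a}^{tag}", f"{b}^{tag}"}))

    # Auxiliary F-node attached to variables in the symmetric difference I Δ J
    # (attached to both domain copies here; see the paper for the exact rule).
    for v in set(I) ^ set(J):
        G.add_edge("F_IJ", f"{v}^I")
        G.add_edge("F_IJ", f"{v}^J")
    return G, bidir

# Tiny example: X -> Y with latent confounding X <-> Y, comparing do({X}) with do(∅).
G, bidir = augmented_pair_graph([("X", "Y")], [("X", "Y")], I={"X"}, J=set())
print(sorted(G.edges()), bidir)
```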
Based on this characterization, the paper proposes Algorithm 1, an FCI-inspired procedure for learning the I-augmented MAG tuple from multiple hard-interventional datasets. The algorithm proceeds in three phases:
- Initialization: Create initial complete graphs for each domain (V(I)) and add F-nodes connected to all observable variables.
- Skeleton Learning: Identify separating sets for pairs of variables within domains (using conditional independence tests within a single interventional distribution P_I) and between F-nodes and observable variables (using equality tests between P_I and P_J), then remove edges based on the discovered separations; a schematic of this phase is sketched just after the list.
- Orientation Rules: Apply a set of rules to orient edges. This includes standard FCI rules (adapted for the augmented graph structure) and four new rules specifically designed for the F-nodes and the multi-domain setting under hard interventions (Rules 8-11). These rules leverage the properties of hard interventions (e.g., intervened variables have no parents in the interventional graph) and the relationships encoded by the F-nodes across domains.
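A schematic of the skeleton-learning phase referenced above. The oracles ci_test and invariance_test, the brute-force search over conditioning sets, and the data structures are illustrative assumptions rather than the paper's pseudocode (a real implementation would restrict candidate separating sets the way FCI does):

```python
from itertools import combinations

def learn_skeleton(variables, targets, ci_test, invariance_test, max_cond=2):
    """Sketch of the skeleton phase: start from a complete graph per domain plus
    fully connected F-nodes, then delete edges whenever a separating set is found.

    variables : list of observed variable names
    targets   : list of intervention targets (e.g., frozensets), one per dataset
    ci_test(I, x, y, Z)         -> True if x is independent of y given Z in P_I
    invariance_test(I, J, y, W) -> True if P_I(y | W) = P_J(y | W)
    """
    adjacencies, sepset = set(), {}

    # Within-domain edges: x -- y survives in domain I only if no separating set is found.
    for I in targets:
        for x, y in combinations(variables, 2):
            rest = [v for v in variables if v not in (x, y)]
            found = next(
                (set(Z) for k in range(max_cond + 1)
                        for Z in combinations(rest, k)
                        if ci_test(I, x, y, set(Z))),
                None,
            )
            if found is None:
                adjacencies.add((I, x, y))
            else:
                sepset[(I, x, y)] = found

    # F-node edges: F_{I,J} -- y survives only if no W makes P_I(y | W) = P_J(y | W).
    for I, J in combinations(targets, 2):
        for y in variables:
            rest = [v for v in variables if v != y]
            if not any(invariance_test(I, J, y, set(W))
                       for k in range(max_cond + 1)
                       for W in combinations(rest, k)):
                adjacencies.add((("F", I, J), y))
    return adjacencies, sepset
```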
The algorithm is proven to be sound (Theorem 2): any adjacency or edge mark (arrowhead or tail) it infers is guaranteed to be present in every I-augmented MAG of any causal graph that is I-Markov equivalent to the true graph.
Practical Implementation and Application:
Implementing this research involves several key steps and considerations:
- Data Collection: Requires collecting data from a set of hard interventions I. The choice of intervention targets significantly impacts the identifiability of the causal structure. The paper suggests that hard interventions can be more informative than soft ones in certain latent variable scenarios (demonstrated empirically).
- Conditional Independence Testing: The core of the algorithm relies on reliably testing conditional independence within single distributions (P_I(y | w, z) = P_I(y | w)) and equality or inequality of conditionals across distributions (P_I(y | w) = P_J(y | w)); a minimal example of both kinds of tests is sketched after this list. The choice of CI test depends on the data type (e.g., Gaussian, discrete) and distributional assumptions (parametric vs. non-parametric). Robust CI testing in high dimensions and with limited data is a known challenge in causal discovery.
- Graph Representation: Implementing the augmented graphs requires data structures capable of representing directed, bidirected, and partially oriented edges (circle marks). Libraries for causal graphical models (such as `causal-learn` in Python) might provide a starting point for graph manipulation and the standard FCI rules, but would need extensions for the F-nodes and the hard-intervention-specific orientation rules; a minimal edge representation is sketched after this list.
- Algorithm Implementation: The algorithm follows the constraint-based paradigm similar to FCI. Implementing the three phases involves iterating through potential separating sets (which can be computationally expensive, though optimizations like testing smaller sets first are common) and applying the orientation rules iteratively until convergence. The added F-nodes and multiple domains increase the size and complexity of the graphs being processed compared to standard FCI.
- Computational Requirements: The complexity of FCI-like algorithms can be significant, particularly in the worst case where graphs are dense or separating sets are large. The introduction of multiple domains and F-nodes further increases the number of nodes and potential edges in the graphs being learned, potentially increasing runtime and memory requirements.
- Handling Latent Variables: The framework explicitly handles latent variables through MAG and PAG concepts, which appear in the domain-specific parts of the I-augmented graphs; bidirected edges represent latent confounding.
- Faithfulness Assumption: The soundness proof relies on the h-faithfulness assumption (Definition 3), which postulates that the conditional independencies and cross-distribution invariances in the distribution tuple are exactly those implied by the d-separation and m-separation statements of the true causal graph. This assumption is standard in constraint-based causal discovery but may not hold exactly in real-world data.
- Empirical Evaluation: The experiments compare the size of the I-Markov equivalence class under hard vs. soft interventions by enumerating or sampling ADMGs. The results show that hard interventions lead to smaller equivalence classes on average, suggesting better identifiability. This implies that for practitioners, using hard interventions might yield a more specific causal graph estimate.
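As referenced in the Conditional Independence Testing item above, here is a minimal example of the two kinds of tests for (roughly) linear-Gaussian data: a Fisher-z partial-correlation CI test within one dataset, and a crude invariance check across two datasets via a Chow-style regression comparison. These are generic statistical choices, not the tests prescribed by the paper:

```python
import numpy as np
from scipy import stats

def fisher_z_ci(data, x, y, z_cols, alpha=0.05):
    """Partial-correlation (Fisher z) test of column x independent of column y given z_cols."""
    n = data.shape[0]
    cols = [x, y] + list(z_cols)
    corr = np.corrcoef(data[:, cols], rowvar=False)
    prec = np.linalg.pinv(corr)
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])   # partial correlation of x and y
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - len(z_cols) - 3)
    p_value = 2 * (1 - stats.norm.cdf(abs(z)))
    return p_value > alpha   # True: independence not rejected

def invariance_check(data_I, data_J, y, w_cols, alpha=0.05):
    """Crude linear-Gaussian check of P_I(y | w) = P_J(y | w): a Chow-style F-test
    comparing a pooled regression of y on w against separate per-dataset fits.
    (A nonparametric conditional two-sample test would be more faithful in general.)"""
    def design(d):
        return np.column_stack([np.ones(len(d))] + [d[:, c] for c in w_cols])
    def rss(X, Y):
        beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
        return float(np.sum((Y - X @ beta) ** 2))
    X_I, X_J = design(data_I), design(data_J)
    Y_I, Y_J = data_I[:, y], data_J[:, y]
    k = X_I.shape[1]
    rss_pooled = rss(np.vstack([X_I, X_J]), np.concatenate([Y_I, Y_J]))
    rss_split = rss(X_I, Y_I) + rss(X_J, Y_J)
    dof = len(Y_I) + len(Y_J) - 2 * k
    f_stat = ((rss_pooled - rss_split) / k) / (rss_split / dof)
    p_value = 1 - stats.f.cdf(f_stat, k, dof)
    return p_value > alpha   # True: equality of the two conditionals not rejected
```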
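And as mentioned in the Graph Representation item, augmented MAG/PAG-style objects need edges with per-endpoint marks (tail, arrowhead, circle). The hand-rolled representation below is shown only to make that data-structure requirement concrete; libraries such as causal-learn provide their own edge and endpoint classes:

```python
from dataclasses import dataclass
from enum import Enum, auto

class Mark(Enum):
    TAIL = auto()     # definite tail
    ARROW = auto()    # definite arrowhead
    CIRCLE = auto()   # undecided mark (PAG-style)

@dataclass(frozen=True)
class Edge:
    a: str
    b: str
    mark_a: Mark      # mark at endpoint a
    mark_b: Mark      # mark at endpoint b

    def __str__(self):
        left = {Mark.TAIL: "-", Mark.ARROW: "<", Mark.CIRCLE: "o"}[self.mark_a]
        right = {Mark.TAIL: "-", Mark.ARROW: ">", Mark.CIRCLE: "o"}[self.mark_b]
        return f"{self.a} {left}-{right} {self.b}"

# Directed X -> Y, bidirected (confounded) X <-> Y, and an unoriented F-node edge.
edges = [
    Edge("X", "Y", Mark.TAIL, Mark.ARROW),        # X --> Y
    Edge("X", "Y", Mark.ARROW, Mark.ARROW),       # X <-> Y
    Edge("F_IJ", "X", Mark.CIRCLE, Mark.CIRCLE),  # F_IJ o-o X
]
print(*map(str, edges), sep="\n")
```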
Potential Applications:
The methods are applicable in domains where controlled hard interventions are feasible, such as:
- Computer Systems: Intervening on system parameters, disabling components, or fixing input values (as mentioned in [wang2023fault] for microsystem architectures). Hard interventions can be precisely controlled.
- Engineered Systems: Testing components or subsystems by setting specific inputs to fixed values.
- Synthetic Biology/Chemistry: In highly controlled lab settings, it might be possible to fix concentrations or knock out gene expression entirely (though biological interventions are often closer to soft interventions).
Limitations and Future Work:
The paper acknowledges limitations, including the potential incompleteness of the proposed learning algorithm (it may leave some edges unoriented even if they are oriented in all I-augmented MAGs of the equivalence class, as discussed in the appendix). Future work could focus on developing complete orientation rules, exploring optimal experimental design strategies for hard interventions, and potentially combining insights from both hard and soft interventions.
In summary, this paper provides a rigorous framework for causal discovery from multiple hard interventions in the presence of latent variables. It introduces novel theoretical tools (generalized do-calculus constraints, augmented graphs) and a sound learning algorithm. The practical application requires careful consideration of data collection, robust independence testing, and computational resources, but offers the potential for more accurate causal structure learning compared to using soft interventions alone in certain scenarios.