PC and GES Algorithms for Causal Discovery

Updated 7 December 2025
  • PC and GES are foundational algorithms for causal discovery, using constraint-based tests and score optimization to recover the Markov equivalence class (CPDAG) from observational data.
  • PC relies on conditional independence testing, while GES performs a greedy BIC-scored search; both first establish and then orient edges in Bayesian networks.
  • Extensions integrating hybrid methods, domain knowledge, and nonparametric transformations enhance robustness and performance in various high-dimensional and non-Gaussian settings.

PC and GES are foundational algorithms for learning the directed acyclic graph (DAG) structure of Bayesian networks and general causal models from observational data. PC (“Peter–Clark”) is the canonical constraint-based approach, while GES (“Greedy Equivalence Search”) is the canonical score-based (and, in some settings, hybrid) method. Both algorithms search for the Markov equivalence class (MEC) of the true DAG, representable by a completed partially directed acyclic graph (CPDAG). They serve as the backbone for both classical and modern approaches in high-dimensional, non-Gaussian, and domain-augmented causal structure discovery.

1. Algorithmic Formulations

PC Algorithm

PC is a constraint-based algorithm operating on conditional independence (CI) relations, usually inferred from partial correlations for continuous data. The skeleton-discovery phase iteratively removes the edge between a variable pair $X_i, X_j$ whenever a test accepts the conditional independence $X_i \perp X_j \mid S$ for some conditioning set $S$; this is typically quantified by the Fisher $z$-transform of the sample partial correlation:

$$z = \frac{1}{2}\ln\frac{1+\hat{r}_{ij\cdot S}}{1-\hat{r}_{ij\cdot S}}$$

The null $H_0: r_{ij\cdot S} = 0$ is rejected if $\sqrt{n-|S|-3}\,|z| > \Phi^{-1}(1-\alpha/2)$. After skeleton identification, Meek’s orientation rules (R1–R4) propagate edge directions to produce a maximally oriented CPDAG (Ramsey, 2015, Michel et al., 23 Oct 2025, Nandy et al., 2015).
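The skeleton-phase test can be sketched in a few lines of numpy (an illustrative implementation, not the pcalg code; computing the partial correlation via regression residuals is one of several equivalent routes):

```python
import numpy as np
from math import erf, sqrt

def fisher_z_test(data, i, j, S, alpha=0.05):
    """Return True if X_i and X_j are judged dependent given X_S,
    using the Fisher z-transform of the sample partial correlation."""
    n = data.shape[0]

    def residual(k):
        # Partial out the conditioning set by linear regression.
        Z = np.column_stack([np.ones(n)] + [data[:, s] for s in S])
        beta, *_ = np.linalg.lstsq(Z, data[:, k], rcond=None)
        return data[:, k] - Z @ beta

    ri, rj = residual(i), residual(j)
    r = np.corrcoef(ri, rj)[0, 1]
    z = 0.5 * np.log((1 + r) / (1 - r))
    # Under H0: r_{ij.S} = 0, sqrt(n - |S| - 3) * z is approximately N(0, 1).
    stat = abs(z) * sqrt(n - len(S) - 3)
    pval = 2 * (1 - 0.5 * (1 + erf(stat / sqrt(2))))
    return pval < alpha
```

PC deletes the edge $i$–$j$ as soon as some conditioning set $S$ makes this test accept independence.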

GES Algorithm

GES searches the CPDAG space via greedy score optimization. It employs a two-phase strategy: Forward Equivalence Search (FES) adds edges incrementally to improve a decomposable score (commonly the BIC), followed by Backward Equivalence Search (BES) that prunes unnecessary edges. The BIC score for DAG GG is:

$$\mathrm{BIC}(G)=\sum_{i=1}^{p}\left[\ell_i\bigl(\hat{\theta}_i \mid \mathrm{Pa}_G(X_i)\bigr)-\frac{|\theta_i|}{2}\ln n\right]$$

GES is locally consistent provided the scoring function penalizes complexity appropriately, so that local score optima reflect the true CI structure (Nandy et al., 2015, Shen et al., 2022, Ramsey, 2015).
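Because the score decomposes over nodes, each FES/BES move only requires rescoring the family of the affected node. A minimal Gaussian node-wise BIC, sketched under a linear-Gaussian likelihood (the parameter-count convention varies across implementations):

```python
import numpy as np

def node_bic(data, i, parents):
    """BIC contribution of node i with the given parent set, under a
    linear-Gaussian model: max log-likelihood minus (|theta_i|/2) ln n."""
    n = data.shape[0]
    X = np.column_stack([np.ones(n)] + [data[:, p] for p in parents])
    beta, *_ = np.linalg.lstsq(X, data[:, i], rcond=None)
    resid = data[:, i] - X @ beta
    sigma2 = resid @ resid / n                  # MLE of the noise variance
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = len(parents) + 2                        # coefficients + intercept + variance
    return loglik - 0.5 * k * np.log(n)
```

FES accepts the single-edge insertion with the largest score gain, i.e., the largest increase in the affected node's `node_bic`; BES symmetrically removes edges whose deletion raises the score.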

2. Theoretical Guarantees and Consistency

In the classical (fixed $p$, $n\to\infty$) regime, GES is consistent under mild assumptions: DAG-perfectness, a unique CPDAG, and a consistent, score-equivalent penalized likelihood (e.g., BIC). For high-dimensional sparse graphs (both $p_n$ and $n$ grow), consistency of GES and hybrid ARGES requires additional constraints: strong faithfulness (partial correlations bounded away from zero), sparse maximum degree $q_n = O(n^{1-b_1})$, and reliable skeleton/CIG estimation. The ARGES variants restrict the search space, mitigating computational costs while maintaining reachability of the true CPDAG (Nandy et al., 2015).

PC achieves consistency under strong-faithfulness and oracle independence tests (exact partial correlations), matching the assumptions required for GES in the high-dimensional case (Nandy et al., 2015). However, GES/ARGES yield a valid CPDAG by construction, whereas PC can output undecided edges or violate acyclicity unless augmented with additional rules.

3. Extensions, Hybridizations, and Domain Integration

PC-GES Hybrid

The PC-GES hybrid uses the PC skeleton as a search restriction for GES, efficiently combining edge elimination (PC) with score-based edge orientation (GES). This can enhance orientation recovery under strong nonlinearities or copula violation regimes (Ramsey, 2015).
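A compressed sketch of the restriction idea, assuming the skeleton has already been estimated. Real GES and PC-GES operate on equivalence classes with insert/delete operators and full acyclicity checks; this plain hill-climb only blocks 2-cycles and is purely illustrative:

```python
import numpy as np

def node_bic(data, i, parents):
    # Gaussian node-wise BIC, matching the decomposable GES score.
    n = data.shape[0]
    X = np.column_stack([np.ones(n)] + [data[:, p] for p in parents])
    beta, *_ = np.linalg.lstsq(X, data[:, i], rcond=None)
    resid = data[:, i] - X @ beta
    s2 = resid @ resid / n
    return -0.5 * n * (np.log(2 * np.pi * s2) + 1) - 0.5 * (len(parents) + 2) * np.log(n)

def hybrid_forward(data, skeleton):
    """Greedy forward phase with edge additions restricted to a
    precomputed (e.g., PC-estimated) skeleton of undirected pairs."""
    p = data.shape[1]
    parents = {i: [] for i in range(p)}
    while True:
        best_gain, best_edge = 0.0, None
        for (a, b) in skeleton:                 # only skeleton pairs are candidates
            for child, par in ((b, a), (a, b)):
                if par in parents[child] or child in parents[par]:
                    continue                    # edge already present (blocks 2-cycles)
                gain = (node_bic(data, child, parents[child] + [par])
                        - node_bic(data, child, parents[child]))
                if gain > best_gain:
                    best_gain, best_edge = gain, (par, child)
        if best_edge is None:
            return parents
        par, child = best_edge
        parents[child].append(par)
```

Edges outside the PC skeleton can never be added, so the score search inherits PC's edge elimination while keeping score-based orientation.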

Domain Knowledge and LLM-Augmented PC/GES

Integrating domain constraints via financial knowledge graphs (KGs) and LLMs significantly improves graph recovery in domain-intensive settings. FinCARE encodes required/forbidden edges and confidence weights into PC via adaptive significance levels $\alpha_{ij}$ and into GES as KG-regularized scores:

$$\mathrm{Score}_{\mathrm{KG}}(G) = \mathrm{BIC}(G) + \lambda_{\mathrm{kg}}\, R_{\mathrm{KG}}(G)$$

where $R_{\mathrm{KG}}$ rewards domain-mandated edges and penalizes prohibited ones. LLM outputs can serve as priors or confidence modifiers, further protecting theoretically important edges and compensating for weak statistical signal. Empirical findings show relative F1 improvements for PC (+36%) and GES (+100%) when enhanced with KG+LLM (Michel et al., 23 Oct 2025).
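In sketch form the regularized score is straightforward; the exact regularizer and weights used by FinCARE are not specified here, so the $\pm 1$ convention and the $\lambda_{\mathrm{kg}}$ value below are placeholder assumptions:

```python
def kg_regularizer(edges, required, forbidden):
    """Hypothetical R_KG: +1 for each required edge present,
    -1 for each forbidden edge present (sign convention assumed)."""
    reward = sum(1.0 for e in required if e in edges)
    penalty = sum(1.0 for e in edges if e in forbidden)
    return reward - penalty

def score_kg(bic, edges, required, forbidden, lam=2.0):
    """Score_KG(G) = BIC(G) + lambda_kg * R_KG(G)."""
    return bic + lam * kg_regularizer(edges, required, forbidden)
```

A GES variant would simply maximize `score_kg` instead of the raw BIC, so domain-mandated edges survive even when their statistical signal is weak.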

4. Nonparanormal Transform and Nonparametric Generalizations

Nonparanormal Transformation

To address non-Gaussianity, the nonparanormal (npn) transform applies univariate empirical normal score transforms to each margin, generating Gaussian marginals while preserving the copula. This preprocessing is beneficial for GES-BIC and PC in moderate non-Gaussian, mildly nonlinear regimes, substantially lowering false positive rates and improving recall (by factors of 2–4 for GES-BIC in NG1/NL1), but is “harmless but largely ineffective” outside these settings (Ramsey, 2015).
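The transform itself is a one-liner per margin: replace each value by the normal quantile of its scaled rank. A minimal sketch (the truncated/Winsorized variant commonly used in practice is omitted):

```python
import numpy as np
from statistics import NormalDist

def nonparanormal(data):
    """Empirical normal-score transform of each column:
    value -> rank -> Phi^{-1}(rank / (n + 1)).
    Gaussianizes the margins while preserving the copula."""
    n, p = data.shape
    inv_cdf = NormalDist().inv_cdf
    out = np.empty((n, p))
    for j in range(p):
        ranks = data[:, j].argsort().argsort() + 1   # ranks 1..n (no ties assumed)
        out[:, j] = [inv_cdf(r / (n + 1)) for r in ranks]
    return out
```

Running PC or GES on `nonparanormal(data)` instead of `data` is the preprocessing step described above.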

Nonparametric GES via Neural Conditional Dependence

GES can be reformulated to use any $\tau$-consistent conditional dependence measure, not just a penalized likelihood. Neural Conditional Dependence (NCD) leverages deep networks to estimate population-level partial associations:

$$S(X,Y \mid Z) = \sup_{f,g}\, \rho^2\bigl(f(X,Z)-h^*(Z),\; g(Y,Z)-\ell^*(Z)\bigr)$$

with the neural network parameters of $f$ and $g$ trained to maximize the empirical squared correlation after orthogonalization with respect to functions of $Z$. The reformulated GES tracks these local CI decisions, achieving large-sample optimality under the standard causal Markov and faithfulness assumptions. Experiments show NCD-GES outperforms both standard GES+BIC and kernel-based scores, with F1 and SHD improvements especially pronounced on nonlinear and misspecified data (Shen et al., 2022).
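To make the objective concrete, here is a fixed-feature-map stand-in for $S(X,Y\mid Z)$: polynomial features play the role of the neural maps $f, g$, residualizing on functions of $Z$ plays the role of subtracting $h^*, \ell^*$, and the supremum over a linear span is the first canonical correlation. This is an illustrative simplification, not the trained-network estimator of Shen et al.:

```python
import numpy as np

def ncd_stat(x, y, z):
    """Largest squared canonical correlation between nonlinear features of
    (X, Z) and of (Y, Z), after regressing out features of Z."""
    def feats(a, b):
        return np.column_stack([a, a ** 2, a * b])    # placeholder feature map
    Fz = np.column_stack([np.ones_like(z), z, z ** 2])

    def residualize(M):
        # Project out the Z-features (the h*/l* orthogonalization step).
        beta, *_ = np.linalg.lstsq(Fz, M, rcond=None)
        R = M - Fz @ beta
        return R - R.mean(axis=0)

    Rx, Ry = residualize(feats(x, z)), residualize(feats(y, z))
    Qx, _ = np.linalg.qr(Rx)
    Qy, _ = np.linalg.qr(Ry)
    # Canonical correlations = singular values of Qx^T Qy.
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)[0] ** 2
```

A small value supports the local decision $X \perp Y \mid Z$ inside the reformulated GES moves; the neural version replaces the fixed features with trained networks.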

5. Practical Performance and Regime-Dependent Guidelines

Simulation studies across both small ($p=50$, $n=1000$) and high-dimensional ($p=500$, $n=250$) regimes demonstrate nuanced tradeoffs:

| Scenario/Setting | PC FPR (adj/arrow) | GES-BIC Recall | GES-BIC + npn | PC-GES Performance |
|---|---|---|---|---|
| Linear + Gaussian | ~0.01–0.02 | ~0.83–0.90 | ≈ raw | Low FPR (~0.05), small RR drop |
| Mild NL + NG1 + npn | 0.23 → 0.05 | 0.71 → 0.79 | 0.67 → 0.17 | Lowest FPR under NL2 |
| Strong NL | – | – | ≈ raw | Outperforms PC, GES on arrows |

AIC penalization is unsuitable when $p$ is large and $n$ is small; there, doubling the BIC penalty is effective. Domain-augmented (KG/LLM) algorithms yield improvements in recall and precision, especially in settings with weak empirical signal or substantial prior knowledge (Ramsey, 2015, Michel et al., 23 Oct 2025).

6. Limitations, Assumptions, and Contemporary Variants

Both PC and GES-type search inherit the strong-faithfulness requirement—nonvanishing minimal partial correlations—which is critical for both theoretical guarantees and empirical reliability (Nandy et al., 2015). For GES, greedy hill climbing may miss globally optimal CPDAGs, and performance is sensitive to the penalty parameter and search space initialization. The success of KG/LLM-domain enhancements depends on the quality of knowledge extraction and task-appropriate regularization. NCD-rich GES offers a flexible path toward robust nonparametric structure learning but introduces additional computational overhead and requires careful training of the neural components (Shen et al., 2022).

7. Implementation and Empirical Recommendations

For moderate size and density ($p \lesssim 300$), vanilla GES with BIC/extended BIC is recommended. For large, sparse graphs, hybrid search-space restriction (ARGES-CIG) using neighborhood selection or adaptive LASSO substantially reduces computation while maintaining statistical consistency. PC is suitable for obtaining rough skeletons rapidly, but its CPDAG output can violate essential acyclicity or completeness constraints. R and Python packages such as pcalg (GES, ARGES, PC), bnlearn (MMHC), huge (neighborhood selection), and glmnet (LASSO) support these methods (Nandy et al., 2015).

Augmentations involving KGs/LLMs are practically valuable in domains with reliable prior sources. In high-noise or nonparametric settings, NCD-GES and similar nonparametric CI-aware algorithms dominate traditional parametric approaches. Penalty tuning (e.g., extended BIC, stability selection) and hybridization (PC-GES, ARGES) further refine performance across practical graphical causal inference tasks.
