Chain Event Graphs
- Chain Event Graphs are probabilistic graphical models that merge isomorphic event tree substructures to capture context-specific conditional independences.
- They facilitate efficient probability propagation and factorization along variable-length paths using a dedicated message-passing algorithm.
- CEGs support advanced decision analysis, diagnostics, and dynamic modeling applicable in domains such as medicine, forensics, and spatial studies.
A Chain Event Graph (CEG) is a probabilistic graphical model constructed to represent efficiently the conditional independences and stochastic mechanisms of processes whose underlying state space is highly asymmetric. Unlike Bayesian Networks (BNs), which require a fixed set of random variables and enforce a product structure on the sample space, CEGs encode asymmetries and context-specific conditional independences directly in the graph topology. CEGs are constructed from event trees by merging vertices (situations) with isomorphic future subtrees into positions, yielding a compact and expressive model well suited to sequential, non-symmetric, and context-dependent phenomena.
1. Structural Foundations and Construction
A CEG is built from an event tree, whose non-leaf vertices, called situations, represent points of uncertainty in the process; its root-to-leaf paths (atoms) correspond to complete event sequences. Situations whose emanating edge probabilities match and whose future subtrees are structurally identical are merged into positions. The vertices of the resulting CEG are these positions, which partition the set of non-leaf vertices of the event tree and thereby capture the different "contexts" or partial histories of the process. Edges are directed and carry transition probabilities, while a unique sink node collects all terminal trajectories.
In the CEG, positions are thus context-equivalence classes: every root-to-leaf path that passes through a position represents a set of realizations that are statistically indistinguishable given their past. The construction leverages two principles:
- Vertex grouping by subtree isomorphism: Vertices are merged into the same position if their subtrees are topologically isomorphic and their transition probabilities correspond via a simple mapping.
- Edge specification: The edges reflect possible transitions, indexed by the positions, with associated probability vectors.
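The two construction principles above can be sketched in a few lines. This is an illustrative encoding, not the API of any CEG package: an event tree is a dict mapping each situation to its `(label, probability, child)` edges, and situations are merged into one position whenever their future subtrees share a canonical signature.

```python
# Sketch: merging event-tree situations into CEG positions by comparing
# the (structure, probability) signature of each vertex's future subtree.
# The tree encoding and helper names are illustrative assumptions.

def subtree_signature(tree, node):
    """Canonical signature of the subtree rooted at `node`.

    Two situations with equal signatures have isomorphic futures with
    matching edge labels and transition probabilities, so they belong
    to the same position.
    """
    edges = tree.get(node, [])
    if not edges:                      # leaf: every leaf maps to the sink
        return "sink"
    return tuple(sorted(
        (label, round(prob, 10), subtree_signature(tree, child))
        for label, prob, child in edges
    ))

def positions(tree, root):
    """Partition the non-leaf vertices of `tree` into positions."""
    classes = {}
    stack, seen = [root], set()
    while stack:
        node = stack.pop()
        if node in seen or node not in tree:
            continue
        seen.add(node)
        classes.setdefault(subtree_signature(tree, node), []).append(node)
        for _, _, child in tree[node]:
            stack.append(child)
    return list(classes.values())

# Toy event tree: after either first outcome, the future is identically
# distributed, so situations s1 and s2 collapse into a single position.
tree = {
    "s0": [("a", 0.5, "s1"), ("b", 0.5, "s2")],
    "s1": [("y", 0.3, "l1"), ("n", 0.7, "l2")],
    "s2": [("y", 0.3, "l3"), ("n", 0.7, "l4")],
}
print(positions(tree, "s0"))   # s1 and s2 share an equivalence class
```

Rounding the probabilities inside the signature is a pragmatic stand-in for the "simple mapping" between probability vectors; a production implementation would compare staged-tree colourings instead.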
A "transporter CEG" is a variant subgraph of the CEG, retaining only the directed edges that encode the main stochastic structure, analogous to the triangulated BN in the junction tree algorithm, and is particularly useful for probability propagation (Thwaites et al., 2012).
2. Probability Propagation and Factorization
The CEG supports efficient inference via a message passing algorithm, designed to propagate and update probabilities upon observation of new evidence. This algorithm consists of a backward collect step followed by a forward distribute step over the transporter CEG:
- Backward pass: For each position $w$, compute the "emphasis" (or potential)
$$\Phi(w) = \sum_{e \in E(w)} \pi_e(w)\,\Phi(w_e), \qquad \Phi(w_\infty) = 1,$$
where $\pi_e(w)$ is the transition probability of moving from $w$ to the position $w_e$ via edge $e$, $E(w)$ is the set of edges emanating from $w$, and $w_\infty$ is the sink.
- Forward update: Given new evidence (compatible with a set of edges $E^*$), revise the edge probabilities as
$$\pi_e^*(w) \propto \pi_e(w)\,\Phi(w_e)\,\mathbf{1}[e \in E^*],$$
normalizing so that the updated probability vector at $w$ is still valid.
- Factorization: The joint probability of a complete root-to-sink path $\lambda = (e_1, \ldots, e_m)$ is
$$P(\lambda) = \prod_{i=1}^{m} \pi_{e_i}(w_{i-1}),$$
where each $\pi_{e_i}(w_{i-1})$ is the transition probability associated with the edge traversed at the corresponding position.
CEG factorization is not over fixed product spaces as in BNs, but along the (possibly variable length) root-to-sink paths, exploiting the positions and the event tree structure for high efficiency in asymmetric scenarios (Thwaites et al., 2012).
3. Conditional Independence and the Separation Theorem
CEGs encode context-specific conditional independence via their topology. The separation theorem for simple (uncoloured) CEGs (sCEGs) gives a graphical criterion, analogous to d-separation in BNs, for reading off independence statements:
- Let $w_1$ and $w_2$ be positions with $w_2$ not preceding $w_1$, and suppose every root-to-sink path through both passes through an intervening cut-vertex $w$. Then the development of the process after $w_2$ is conditionally independent of the history up to $w_1$, given that the process passes through $w$.
A cut-vertex is a vertex (other than the root or sink) that all root-to-sink paths traverse. The existence of a cut-vertex between two positions "blocks" the influence between them; conditional independence can thus be read directly from the presence or absence of such topological features (Thwaites et al., 2015).
This design allows CEGs to express a much richer class of conditional independence relations—often context-specific and asymmetric—than BNs, and permits local reconsideration of independence structure using graph-theoretic arguments.
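A minimal sketch of how such topological features can be read off mechanically, under the same toy encoding as before: a cut-vertex is a position (other than the root or sink) lying on every root-to-sink path. Enumerating paths is exponential in general, so this is purely illustrative.

```python
# Sketch: detecting cut-vertices, the positions that "block" influence
# between upstream and downstream parts of the process.

def root_to_sink_paths(ceg, root, sink):
    """All root-to-sink paths, each as the list of positions visited."""
    if root == sink:
        return [[sink]]
    return [
        [root] + rest
        for _, _, child in ceg[root]
        for rest in root_to_sink_paths(ceg, child, sink)
    ]

def cut_vertices(ceg, root, sink):
    """Positions traversed by every root-to-sink path."""
    paths = root_to_sink_paths(ceg, root, sink)
    common = set.intersection(*(set(p) for p in paths))
    return common - {root, sink}

# w1 is a bottleneck: conditional on reaching w1, what happens downstream
# is independent of which edge was taken upstream.
ceg = {
    "w0": [("a", 0.4, "w1"), ("b", 0.6, "w1")],
    "w1": [("c", 0.5, "w2"), ("d", 0.5, "w3")],
    "w2": [("e", 1.0, "sink")],
    "w3": [("f", 1.0, "sink")],
}
print(cut_vertices(ceg, "w0", "sink"))   # {'w1'}
```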
4. Decision Analysis and Dynamic Modelling
CEGs are particularly well suited for decision analysis in asymmetric processes. When extended to encode chance and decision nodes, they support efficient computation of optimal policies through local message-passing. For a given position $w$, with edge utilities $u(e)$ and expected utility-to-go $\bar{U}(w)$:
- If $w$ is a chance node: $\bar{U}(w) = \sum_{e \in E(w)} \pi_e(w)\,\big[u(e) + \bar{U}(w_e)\big]$
- If $w$ is a decision node: $\bar{U}(w) = \max_{e \in E(w)} \big[u(e) + \bar{U}(w_e)\big]$
After backward propagation, the root position(s) encode the maximal expected utility and permit the identification of optimal decision strategies (Thwaites et al., 2015).
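This backward induction is short to sketch. The encoding is an assumption (utilities placed on edges, a `kind` map marking each position as chance or decision); it is not any package's API.

```python
# Sketch of backward induction over a decision CEG: chance positions
# average utility-to-go over edge probabilities, decision positions
# maximise over their available edges.

def expected_utility(ceg, node, kind):
    """Maximal expected utility attainable from `node`.

    `ceg[node]` lists (utility, prob, child) edges; `kind[node]` is
    "chance" or "decision"; terminal positions contribute zero.
    """
    if node not in ceg:                 # sink / terminal position
        return 0.0
    values = [u + expected_utility(ceg, child, kind)
              for u, _, child in ceg[node]]
    if kind[node] == "decision":
        return max(values)              # choose the best edge
    probs = [p for _, p, _ in ceg[node]]
    return sum(p * v for p, v in zip(probs, values))

# One decision at d0, each option leading to a different chance position:
# a risky branch (c1) versus a safe branch (c2).
ceg = {
    "d0": [(0, None, "c1"), (0, None, "c2")],
    "c1": [(10, 0.9, "sink"), (-50, 0.1, "sink")],
    "c2": [(5, 1.0, "sink")],
}
kind = {"d0": "decision", "c1": "chance", "c2": "chance"}
print(expected_utility(ceg, "d0", kind))   # 5.0: the safe branch wins
```

The risky branch yields 0.9·10 + 0.1·(−50) = 4, so the optimal policy read off at the root is the safe branch with value 5, mirroring how the root positions encode the maximal expected utility after propagation.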
Dynamic extensions include the N Time-slice Dynamic Chain Event Graph (NT-DCEG), which is constructed from recursively combined finite event tree objects and is closely related to time-homogeneous Markov processes. The NT-DCEG explicitly accounts for time-homogeneity and periodicity, enables object-oriented model construction, and supports reading context-specific independences directly from the graph (Collazo et al., 2018). Continuous-time extensions (CT-DCEG) further generalize this paradigm to semi-Markov processes with arbitrary holding times, allowing both transition and holding-time distributions to be modelled and updated via propagation algorithms (Shenvi et al., 2020).
5. Model Selection, Diagnostics, and Software
CEGs are amenable to Bayesian and algorithmic learning approaches. Staged trees, colored by transition symmetries, provide the basis for clustering of situations into stages, and model selection typically relies on maximizing the marginal likelihood over possible groupings. Algorithms such as Agglomerative Hierarchical Clustering (AHC) or mixture-modelling approaches (for non-conjugate settings) enable scalable search for optimal stage partitions (Shenvi et al., 2022). Bayesian model averaging can be used for robust inference, quantifying model uncertainty and identifying features common to many plausible CEG structures (Strong et al., 2022).
Diagnostics for CEGs mirror and extend those for BNs, with global, staging, position, and situation monitors—based on predictive sequential (prequential) log-scores and Bayes factors—used to assess goodness-of-fit and structural adequacy as data accumulate (Wilkerson et al., 2019).
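A single-position monitor of this kind reduces to a cumulative one-step-ahead log score. The sketch below assumes a Dirichlet prior at the position and scores each arriving observation by its predictive probability before updating the counts; the function name and prior are illustrative.

```python
# Sketch of a prequential (predictive sequential) log-score monitor for
# one position: large cumulative penalties flag poor fit as data accrue.
from math import log

def prequential_log_score(observations, k, alpha=1.0):
    """Cumulative -log predictive score over a categorical sequence.

    `observations` are category indices in range(k); the predictive
    probability of category j after counts n is (alpha + n_j) / (k*alpha + n).
    """
    counts = [0] * k
    score = 0.0
    for obs in observations:
        n = sum(counts)
        pred = (alpha + counts[obs]) / (k * alpha + n)
        score -= log(pred)              # penalise surprising observations
        counts[obs] += 1
    return score

# Monitor a binary position over eight sequential observations.
sequence = [0, 1, 0, 1, 0, 0, 1, 0]
print(round(prequential_log_score(sequence, k=2), 3))
```

Comparing such scores across candidate stagings (or against a reference model via a Bayes factor) gives the global, staging, and position monitors described above.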
Recent software developments include Python (cegpy) and R (stCEG) packages for construction, visualization, probability propagation, evidence updating, and spatial integration of CEG models, supporting both algorithmic and expert-driven model specification (Walley et al., 2022, Calley et al., 9 Jul 2025). These toolkits allow both data-driven learning and interactive model customisation.
6. Applications and Extensions
CEGs have demonstrated utility across varied domains where asymmetric and context-specific processes are fundamental. Examples include medical treatment regimes (more compact and interpretable CEGs compared to BNs), forensic science (explicit mapping of competing causal narratives and evaluation of evidence via likelihood ratios), migration pathway modeling (Bayesian inference and model selection integrating ABM logic), decision analysis for dynamic or sequential processes, and spatial analysis of crime or public health data (Thwaites et al., 2012, Robertson et al., 3 Apr 2024, Strong et al., 2021, Calley et al., 9 Jul 2025).
Causal inference in CEGs is formalized through the definition of manipulation (including remedial intervention), causal algebras, and adaptations of the back-door theorem to the event-tree setting, enabling identification and estimation of intervention effects even under partial observability (Yu et al., 2022). Dynamic and continuous-time extensions enhance expressiveness for event-history and time-to-event processes.
7. Theoretical Significance and Comparative Perspective
CEGs generalize and unify several strands in probabilistic graphical modeling:
- They recover BNs as a special case but remove the requirement for a fixed variable structure and product space.
- They allow for efficient encoding and inference in highly asymmetric and context-specific domains via topology-driven factorization and message passing.
- The explicit separation theorem and precise propagation rules provide a robust foundation for both theoretical and practical work in conditional independence, evidence updating, and dynamic modeling.
- A plausible implication is that, for many sequential, heterogeneous, or process-oriented phenomena, CEGs can yield more parsimonious and interpretable models, particularly where BNs or classical Markov models are unwieldy or insufficient.
The ongoing development of scalable algorithms and the growth of software support are expanding the reach of CEGs in both research and applied settings.