Variational Temporal GNN Explainer
- The paper introduces VA-TGExplainer, which employs variational masks to quantify the temporal importance of edges in TGNN-based intrusion detection.
- The methodology uses reparameterized variational inference and ELBO optimization, showing that removing the top-ranked edges reduces anomaly scores by 78% on average.
- The framework preserves temporal-causal structures and provides uncertainty estimates that enhance interpretability and trust for SOC analysts in forensic analysis.
A Variational Temporal Graph Explainer (VA-TGExplainer) is an explanation module designed for Temporal Graph Neural Network (TGNN)–based intrusion detection systems (IDS) operating on provenance graphs. It quantifies the importance of edges in a temporal context for high anomaly score predictions, outputs distributions over edge importances instead of deterministic masks, and explicitly models uncertainty in subgraph-based explanations. VA-TGExplainer operates downstream of TGNN link-prediction heads, providing fine-grained, uncertainty-aware explanations compatible with complex, time-evolving graph structures such as those derived from system audit data (Dhanuka et al., 20 Dec 2025).
1. Mathematical Formalism
Let $G = (V, E, T)$ denote a temporal provenance graph, with node set $V$ (e.g., processes, files, sockets), a set $E$ of directed, timestamped edges representing system events, and $T$ the set of discrete timestamps. For a target event (anomalous edge) $e^* = (u, v, t^*)$ in $G$, the associated TGNN provides a link-prediction score $p_\theta(e^*)$, with the anomaly score derived from the negative log-likelihood of the event label: $s(e^*) = -\log p_\theta(y^* \mid e^*)$.
An explanation context subgraph $G_c \subseteq G$ is formed by selecting edges within a specific time window (e.g., 15 minutes) or a k-hop causal neighborhood relevant to $e^*$, preserving event temporal order.
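As a minimal illustration of this windowing step (the `Event` layout and function name are assumptions, not from the paper):

```python
from typing import List, NamedTuple

class Event(NamedTuple):
    src: int      # source node (e.g., a process)
    dst: int      # destination node (e.g., a file or socket)
    t: float      # event timestamp in seconds

def context_subgraph(events: List[Event], target: Event,
                     window_s: float = 15 * 60) -> List[Event]:
    """Select the edges inside a time window ending at the target event,
    keeping them in temporal order (events are never permuted or retimed)."""
    ctx = [e for e in events if target.t - window_s <= e.t <= target.t]
    return sorted(ctx, key=lambda e: e.t)
```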
The explainer introduces a vector of random mask variables $m = (m_1, \dots, m_{|E_c|})$, where each $m_i \in (0, 1)$ scales edge $e_i$'s presence in the masked graph $G_m$. Masks are sampled from a variational posterior
$$q_\phi(m_i): \quad m_i = \sigma(z_i), \qquad z_i \sim \mathcal{N}(\mu_i, \sigma_i^2),$$
with $\phi = \{(\mu_i, \sigma_i^2)\}_i$ and $\sigma(\cdot)$ the logistic sigmoid. The mask variables are parameterized by $\phi$, enabling continuous edge-importance estimation and uncertainty quantification.
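A minimal PyTorch sketch of this reparameterized sampling step (tensor names are illustrative):

```python
import torch

def sample_masks(mu: torch.Tensor, log_var: torch.Tensor) -> torch.Tensor:
    """Logistic-normal reparameterization: z_i ~ N(mu_i, sigma_i^2),
    m_i = sigmoid(z_i). Gradients flow to (mu, log_var) through z."""
    eps = torch.randn_like(mu)                 # standard-normal noise
    z = mu + torch.exp(0.5 * log_var) * eps    # z ~ N(mu, sigma^2)
    return torch.sigmoid(z)                    # masks m in (0, 1)
```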
2. Objective and Training Procedure
VA-TGExplainer maximizes a variational evidence lower bound (ELBO) for $\log p_\theta(y^* \mid G_m)$ with respect to $\phi$:
$$\mathcal{L}(\phi) = \mathbb{E}_{q_\phi(m)}\!\left[\log p_\theta(y^* \mid G_m)\right] - \mathrm{KL}\!\left(q_\phi(m)\,\|\,p(m)\right) - \lambda\,\Omega(m),$$
where $p(m)$ is an elementwise prior (e.g., i.i.d. normal in logit space), the KL divergence is available in closed form, and $\Omega(m)$ is a sparsity penalty (a sum or $L_1$ norm over the mask means); $\lambda$ controls the sparsity–fidelity trade-off. The main loss is implemented with cross-entropy for TGNN link prediction, augmented by the KL divergence and sparsity regularization.
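A hedged sketch of this composite loss, assuming the standard-normal prior in logit space and a single-sample Monte Carlo estimate; `logit` stands for the frozen TGNN decoder's output on the masked graph, and `lam` is the sparsity coefficient $\lambda$ (all names are placeholders):

```python
import torch
import torch.nn.functional as F

def neg_elbo(mu: torch.Tensor, log_var: torch.Tensor,
             logit: torch.Tensor, label: torch.Tensor,
             lam: float = 0.01) -> torch.Tensor:
    """Negative ELBO: cross-entropy data term, closed-form KL to an
    i.i.d. standard-normal prior in logit space, and L1 sparsity on
    the mask means."""
    nll = F.binary_cross_entropy_with_logits(logit, label)
    kl = 0.5 * (mu.pow(2) + log_var.exp() - log_var - 1.0).sum()
    sparsity = torch.sigmoid(mu).sum()   # mask means are positive: L1 norm
    return nll + kl + lam * sparsity
```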
The training loop for a target event $e^*$ involves:
- Sampling edge mask variables $m$ via reparameterization at each epoch
- Forming a masked graph $G_m$
- Evaluating loss components: data likelihood (via frozen TGNN decoder), KL penalty, sparsity
- Gradient update (Adam optimizer) on $\phi$ using the composite loss

Inference uses the posterior mean $\sigma(\mu_i)$ as the edge importance score, with the variance parameter $\sigma_i^2$ quantifying uncertainty; see the sketch below.
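Combining the pieces, a sketch of the per-event loop under these assumptions (it reuses the `sample_masks` and `neg_elbo` sketches above; for simplicity the mask parameters are optimized directly per event, whereas the encoder described in Section 5 would produce them from an MLP):

```python
import torch

def explain_event(tgnn_logit, n_edges: int, label: torch.Tensor,
                  epochs: int = 200, lam: float = 0.01):
    """Fit phi = (mu, log_var) for one target event.
    `tgnn_logit(masks)` must return the frozen TGNN decoder's logit
    for the event on the mask-weighted context graph G_m."""
    mu = torch.zeros(n_edges, requires_grad=True)
    log_var = torch.zeros(n_edges, requires_grad=True)
    opt = torch.optim.Adam([mu, log_var])
    for _ in range(epochs):
        opt.zero_grad()
        masks = sample_masks(mu, log_var)    # reparameterized draw
        loss = neg_elbo(mu, log_var, tgnn_logit(masks), label, lam)
        loss.backward()                      # gradients w.r.t. phi only
        opt.step()
    # posterior mean = importance score; variance = uncertainty
    return torch.sigmoid(mu).detach(), log_var.exp().detach()
```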
3. Temporal and Causal Structure Preservation
In alignment with provenance analysis requirements, the context subgraph for explanation construction is defined by the TGNN’s sliding window (e.g., 15 minutes) or causal neighborhood. The masking operation preserves temporal ordering: masks remove (or downweight) edges but do not permute or retime events. This design maintains the correspondence between explanation and the temporal-causal relationships critical for forensic analysis.
4. Edge Importance and Uncertainty Quantification
VA-TGExplainer outputs, for each edge in $G_c$, both the mean inclusion probability and an uncertainty measure, such as the posterior variance $\sigma_i^2$, or empirically via sampling. High-mean, low-variance edges are considered definite contributors; high-variance edges are interpreted as tentative, highlighting model uncertainty about their explanatory relevance.
Explanations can be thresholded, e.g., retaining edges whose mean importance exceeds a threshold $\tau$. JSON reports for each edge typically contain the fields source, destination, relation type, mean importance, and variance, enabling downstream interpretation.
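A sketch of this report format (field names mirror the list above; the threshold `tau` and record layout are assumptions):

```python
import json

def edge_report(edges, means, variances, tau: float = 0.5) -> str:
    """Emit one JSON record per retained edge, dropping edges whose
    mean importance falls below the inclusion threshold tau."""
    records = [
        {"source": src, "destination": dst, "relation_type": rel,
         "mean_importance": round(float(m), 4),
         "variance": round(float(v), 4)}
        for (src, dst, rel), m, v in zip(edges, means, variances)
        if m > tau
    ]
    return json.dumps(records, indent=2)
```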
5. Architecture, Implementation, and Computational Characteristics
The VA-TGExplainer encoder employs a two-layer multilayer perceptron (MLP), with input features given by the concatenation of the source and destination TGNN node embeddings ($h_u$, $h_v$) and the original edge features $x_{uv}$, outputting $\mu_i$ and $\log \sigma_i^2$. The decoder is the frozen TGNN link-prediction head, ensuring explanations are contextualized to the pre-trained model's predictions. Adam optimization with standard hyperparameters and 200 epochs per event is typical. Relaxation is achieved via logistic-normal reparameterization; Gumbel-softmax is not required.
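A minimal PyTorch sketch of such an encoder (the hidden width and variable names are assumptions):

```python
import torch
import torch.nn as nn

class MaskEncoder(nn.Module):
    """Two-layer MLP: [h_u ; h_v ; x_uv] -> (mu_i, log sigma_i^2) per edge."""
    def __init__(self, node_dim: int, edge_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * node_dim + edge_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2),   # one mean, one log-variance per edge
        )

    def forward(self, h_u, h_v, x_uv):
        mu, log_var = self.net(torch.cat([h_u, h_v, x_uv], dim=-1)).unbind(-1)
        return mu, log_var
```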
Hardware requirements are moderate: single-GPU operation suffices for up to approximately 5k edges in $G_c$, with fallback to CPU triggered if less than 500 MB of GPU memory remains. Evaluation of runtime and mask statistics on DARPA CADETS windows yields a 3–5 s overhead per event.
| Method | Time/event | Avg Mask Size | Comprehensiveness | Sufficiency |
|---|---|---|---|---|
| GraphMask | 1.8 s | ~10 edges | 0.90 | 0.08 |
| GNNExplainer | 2.5 s | ~5 edges | 0.82 | 0.15 |
| VA-TGExplainer | 3.8 s | ~5 edges | 0.84 | 0.12 |
6. Empirical Evaluation and Comparative Metrics
Empirical evaluation on the DARPA CADETS dataset demonstrates that VA-TGExplainer preserves high fidelity to the TGNN’s decision process. Ablation by removing the top-3 mean-mask edges lowers anomaly scores by 78% on average. Masks with 3–5 edges yield comprehensiveness above 0.8, and expansion to approximately 8 edges increases comprehensiveness beyond 0.9, but with diminishing returns.
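For reference, comprehensiveness and sufficiency are commonly computed for mask explanations as score drops under edge removal and edge retention, respectively; a sketch under that assumption (whether the paper uses exactly these definitions is not stated here):

```python
def comprehensiveness(score_fn, all_edges, expl_edges) -> float:
    """Drop in the anomaly score when explanation edges are removed:
    higher means the explanation captures what drives the score."""
    rest = [e for e in all_edges if e not in expl_edges]
    return score_fn(all_edges) - score_fn(rest)

def sufficiency(score_fn, all_edges, expl_edges) -> float:
    """Drop in the anomaly score when only explanation edges are kept:
    lower means the explanation alone reproduces the prediction."""
    return score_fn(all_edges) - score_fn(list(expl_edges))
```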
Uncertainty quantification differentiates VA-TGExplainer: edges with variance below 0.02 are consistent across random restarts, while those above 0.1 fluctuate in approximately 30% of runs. This feature supports communication of explanation confidence to end-users such as SOC analysts.
Comparative baselines indicate:
- GraphMask produces larger, global window-level subgraphs (~10 edges), highest fidelity (0.90), and rapid computation, but lacks uncertainty measures.
- GNNExplainer delivers localized, event-specific masks and standard fidelity, without uncertainty quantification.
- VA-TGExplainer achieves similar mask compactness and slightly higher comprehensiveness than GNNExplainer, with the added benefit of modeling multiple plausible explanations and reporting explanation uncertainty (Dhanuka et al., 20 Dec 2025).
7. Application and Significance in SOC Analysis
VA-TGExplainer is architected for integration with post-hoc explainability frameworks such as PROVEX and temporal graph IDS like KAIROS, with general applicability to other temporal graph-based detection systems. By providing human-interpretable, uncertainty-aware explanations that highlight key causal subgraphs, VA-TGExplainer is designed to enhance SOC analyst trust, improve incident triage speed, and support provenance-based threat forensics. Its ability to quantify edge-level uncertainty in explanations has particular value under ambiguous, adversarial, or noisy attack scenarios, offering a principled approach to calibrating explanation confidence in large-scale, rapidly evolving cyber environments (Dhanuka et al., 20 Dec 2025).