Causal Structure Learning in Hawkes Processes with Complex Latent Confounder Networks (2508.11727v1)

Published 15 Aug 2025 in cs.LG and stat.ML

Abstract: The multivariate Hawkes process provides a powerful framework for modeling temporal dependencies and event-driven interactions in complex systems. While existing methods primarily focus on uncovering causal structures among observed subprocesses, real-world systems are often only partially observed, with latent subprocesses posing significant challenges. In this paper, we show that continuous-time event sequences can be represented by a discrete-time model as the time interval shrinks, and we leverage this insight to establish necessary and sufficient conditions for identifying latent subprocesses and the causal influences. Accordingly, we propose a two-phase iterative algorithm that alternates between inferring causal relationships among discovered subprocesses and uncovering new latent subprocesses, guided by path-based conditions that guarantee identifiability. Experiments on both synthetic and real-world datasets show that our method effectively recovers causal structures despite the presence of latent subprocesses.

Summary

The paper's main contribution is a two-phase iterative framework that robustly identifies both observed and latent causal relationships in multivariate Hawkes processes.
The methodology leverages discrete-time approximations, rank constraints, and path-based symmetry conditions to infer causal structures amid latent confounders.
Experimental results on synthetic and real-world datasets demonstrate the framework's superior performance over existing methods in complex partially observed systems.

Introduction

The paper "Causal Structure Learning in Hawkes Processes with Complex Latent Confounder Networks" (2508.11727) addresses causal discovery in multivariate Hawkes processes when complex networks of latent confounders are present. Many existing methods for inferring causal structure in Hawkes processes assume that every subprocess is observed, an assumption that rarely holds in real-world systems, where some subprocesses remain latent.

To handle this, the paper introduces a two-phase iterative framework that uncovers latent subprocesses and infers the causal influences among observed and latent subprocesses alike. The approach applies rank constraints and path-based conditions to discretized data, which makes the causal structure identifiable even when latent subprocesses act as confounders.

Multivariate Hawkes Process and Model Definition

A multivariate Hawkes process is a self-exciting point process: each subprocess has an intensity function that depends on past events, which is how the model captures temporal dependencies. Subprocesses interact through excitation functions that quantify how strongly events in one subprocess raise the intensity of another. The authors propose a Partially Observed Multivariate Hawkes Process-based Causal Model (PO-MHP), which accommodates both observed and latent subprocesses in a directed causal graph.
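
For reference, the intensity of a d-dimensional linear Hawkes process is commonly written as below. The notation is the standard textbook convention and an assumption here, since the summary does not reproduce the paper's symbols: \mu_i is the baseline rate of subprocess i, \phi_{ij} is the excitation function from subprocess j to subprocess i, and N_j is the counting process of subprocess j.

```latex
\lambda_i(t) \;=\; \mu_i \;+\; \sum_{j=1}^{d} \int_{0}^{t} \phi_{ij}(t - s)\, \mathrm{d}N_j(s),
\qquad i = 1, \dots, d
```

Under this convention, subprocess j is a candidate cause of subprocess i whenever \phi_{ij} is not identically zero, which is what ties the excitation functions to edges of the directed causal graph.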

The model representation rests on a discrete-time approximation: as the time interval shrinks, the continuous-time dynamics of the Hawkes process can be represented by a linear autoregressive model. This makes analysis and inference more tractable, because both can work with cross-covariance statistics of the discretized event data.
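
The discretization step can be sketched as follows. This is a minimal illustration, not the paper's implementation: the helper names `bin_counts` and `lagged_cross_cov`, the bin width `delta`, and the toy event times are all assumptions introduced here. The sketch bins each subprocess's event times into counts and computes an empirical lagged cross-covariance matrix of those counts.

```python
import numpy as np

def bin_counts(event_times, t_end, delta):
    """Count the events of each subprocess in consecutive bins of width delta.

    event_times: list of 1-D sequences, one per subprocess (hypothetical layout).
    Returns an array of shape (n_subprocesses, n_bins).
    """
    n_bins = int(np.ceil(t_end / delta))
    edges = np.arange(n_bins + 1) * delta
    return np.stack([np.histogram(times, bins=edges)[0] for times in event_times])

def lagged_cross_cov(counts, lag):
    """Empirical covariance between counts at bin t and counts at bin t - lag.

    Entry (i, j) relates subprocess i at bin t to subprocess j at bin t - lag.
    """
    centered = counts - counts.mean(axis=1, keepdims=True)
    n = counts.shape[1] - lag
    return centered[:, lag:] @ centered[:, :n].T / n

# Toy event times for two subprocesses on the window [0, 3] (made-up values).
event_times = [[0.1, 0.2, 1.5], [0.9, 2.1]]
counts = bin_counts(event_times, t_end=3.0, delta=1.0)
cov_lag0 = lagged_cross_cov(counts, lag=0)
cov_lag1 = lagged_cross_cov(counts, lag=1)
```

A smaller `delta` follows the continuous-time process more closely but leaves fewer events per bin, so the covariance estimates become noisier for a fixed observation window.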

Identifying Latent Subprocesses and Causal Influences

The central contribution of the paper is a method for identifying latent confounder subprocesses and their causal influences in the network. Using path-based symmetry conditions together with the linear structure of the discretized Hawkes process, the framework derives rank-deficiency conditions that signal the presence of latent subprocesses.

A significant portion of the paper develops the theory behind these conditions and shows how rank patterns in the observed data support inference of both observed and latent causal relations. Latent confounders become identifiable when their effects on the observed subprocesses appear as surplus rank-one contributions in the data.
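
To make the idea of a rank-deficiency signal concrete, the sketch below uses a generic linear latent-factor example rather than the paper's exact conditions. Everything in it is an assumption for illustration: the function `numerical_rank`, the relative tolerance `tol`, and the simulated Gaussian data. When a single hidden variable drives four observed variables, the cross-covariance block between two of them and the other two has rank one; with two independent hidden drivers the same block has rank two.

```python
import numpy as np

def numerical_rank(matrix, tol=0.1):
    """Count singular values larger than tol times the largest one.

    The relative threshold is a heuristic chosen for this sketch; a
    statistical rank test would be preferable on real data.
    """
    s = np.linalg.svd(matrix, compute_uv=False)
    if s[0] == 0:
        return 0
    return int(np.sum(s > tol * s[0]))

rng = np.random.default_rng(0)
n = 20000

# Case 1: one latent variable drives all four observed variables.
latent = rng.normal(size=(1, n))
loadings_one = np.array([[1.0], [0.8], [1.2], [0.6]])
observed_one = loadings_one @ latent + 0.5 * rng.normal(size=(4, n))

# Case 2: two independent latent variables, each driving two observed variables.
latents = rng.normal(size=(2, n))
loadings_two = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
observed_two = loadings_two @ latents + 0.5 * rng.normal(size=(4, n))

def cross_block(observed):
    """Covariance block between variables {0, 1} and variables {2, 3}."""
    cov = np.cov(observed)
    return cov[np.ix_([0, 1], [2, 3])]

rank_one_latent = numerical_rank(cross_block(observed_one))
rank_two_latents = numerical_rank(cross_block(observed_two))
```

The cross block is used rather than the full covariance matrix because independent noise inflates the diagonal of the full matrix, while the off-diagonal block only reflects shared causes.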

Algorithmic Implementation

The proposed algorithm operates in two main phases. Phase I infers causal relationships among the subprocesses discovered so far (initially the observed ones), using rank conditions to constrain the candidate parent sets. Phase II systematically uncovers latent confounder subprocesses through pattern-based analysis of the interactions among those subprocesses.

The two phases alternate: each round uses the relationships found so far to search for new latent subprocesses, which are then added to the graph and analyzed in the next round. The authors give detailed proofs of the identifiability guarantees of this procedure and argue that it remains reliable in complex partially observed systems.
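
The alternation between the two phases can be sketched as a simple control loop. This shows only the control flow under stated assumptions: `alternate_phases`, `infer_edges`, `find_latents`, and `max_rounds` are hypothetical names, and the two toy callables stand in for the paper's rank-based and path-based procedures, which the summary does not spell out.

```python
from typing import Callable, Iterable, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)

def alternate_phases(
    observed: Iterable[str],
    infer_edges: Callable[[Set[str]], Set[Edge]],
    find_latents: Callable[[Set[str], Set[Edge]], Set[str]],
    max_rounds: int = 10,
) -> Tuple[Set[str], Set[Edge]]:
    """Alternate edge inference and latent discovery until nothing new is found."""
    active = set(observed)
    for _ in range(max_rounds):
        edges = infer_edges(active)                         # Phase I (placeholder)
        new_latents = find_latents(active, edges) - active  # Phase II (placeholder)
        if not new_latents:
            return active, edges
        active |= new_latents
    # Round budget exhausted: refresh edges so they cover the latest latents.
    return active, infer_edges(active)

# Toy stand-ins: a single latent "L1" is discovered once, then explains A and B.
def toy_infer_edges(active: Set[str]) -> Set[Edge]:
    return {("L1", "A"), ("L1", "B")} if "L1" in active else set()

def toy_find_latents(active: Set[str], edges: Set[Edge]) -> Set[str]:
    return set() if "L1" in active else {"L1"}

nodes, edges = alternate_phases(["A", "B"], toy_infer_edges, toy_find_latents)
```

The loop stops as soon as a round adds no new latent subprocess; the `max_rounds` cap is a safeguard added for this sketch and is not taken from the paper.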

Experimental Evaluation

The paper's experimental validation includes tests on synthetic datasets, curated to reflect various structural configurations of Hawkes processes, both fully and partially observed. These experiments illustrate the efficacy of the method when benchmarked against existing techniques like Structural Hawkes Processes (SHP) and Rank-based Learning.

Across scenarios with larger causal graphs, more intricate latent confounder structures, and varying discretization granularity, the method consistently outperformed the baseline models. The algorithm is also applied to a real-world cellular network alarm dataset, where its output aligns with a ground truth validated by domain experts, which demonstrates its practical applicability.
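
This summary does not state which metrics the paper reports. A common way to compare a recovered causal graph against a ground truth is precision, recall, and F1 over directed edges; the sketch below shows that computation. The function `edge_scores` and the example edge sets are illustrative assumptions, not results from the paper.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)

def edge_scores(predicted: Set[Edge], truth: Set[Edge]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over directed edges; direction must match."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Made-up graphs: the prediction reverses one edge, C -> A instead of A -> C.
truth = {("A", "B"), ("B", "C"), ("A", "C")}
predicted = {("A", "B"), ("B", "C"), ("C", "A")}
precision, recall, f1 = edge_scores(predicted, truth)
```

Because edges are compared as ordered pairs, a reversed edge counts as both a false positive and a false negative.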

Conclusion

This research presents a framework for learning causal structure in multivariate Hawkes processes that contain latent subprocesses. It handles the latent subprocesses through a discrete-time representation and rank-based identifiability conditions. The result extends causal inference to partially observed networks and offers a systematic procedure for time-series datasets with potentially unobserved components.

Future research may extend the approach to other types of event-driven temporal data and explore relaxed criteria for scenarios where full identifiability cannot be achieved.