Temporal Graph Modelling
- Temporal Graph Modelling is a framework to represent and learn from networks that evolve over time via continuous events or discrete snapshots.
- It employs diverse techniques such as neural ODEs, RNNs, and hierarchical pooling to capture fine-grained temporal dynamics and improve predictive accuracy.
- Evaluation protocols focus on dynamic link prediction, node property forecasting, and generative modeling to validate temporal reasoning and performance.
Temporal Graph Modelling (TGM) denotes the study of how to represent, learn from, query, and generate graphs whose topology and/or attributes evolve over time. In the surveyed formulation, temporal graphs appear either as continuous-time event streams or as discrete snapshot sequences, where each snapshot aggregates the events in a time window (Gupta et al., 2022). More generally, a finite, discrete Time-Varying Graph may be written as $H = (V, E, T)$ with $E \subseteq V \times T \times V \times T$, so that a dynamic edge links a vertex-time pair $(u, t_a)$ to a vertex-time pair $(v, t_b)$ (Wehmuth et al., 2014). Across these formalisms, TGM addresses non-stationarity, fine- versus coarse-grained dynamics, inductive generalization, scalability, and evaluation, while supplying the mathematical substrate for temporal reasoning, forecasting, anomaly detection, pattern matching, and generative modeling (Gupta et al., 2022).
1. Formal representations and problem settings
A central distinction in TGM is between continuous-time interaction data and discrete-time graph sequences. In continuous-time settings, a temporal graph is modeled as a set of timestamped interactions or, in event-based form, as a set of events for which both link start times and link durations are retained (Guo et al., 2022; Zhang et al., 2023). In discrete-time settings, the graph is represented as a sequence of snapshots, each with its own node features and edge features (Maheshwari et al., 2024).
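The relationship between the two settings can be illustrated with a minimal sketch that aggregates a continuous-time event stream into fixed-width snapshots. The function name `events_to_snapshots`, the `(src, dst, t)` tuple layout, and the half-open windowing convention are assumptions made for illustration, not taken from any of the cited papers.

```python
from collections import defaultdict


def events_to_snapshots(events, window):
    """Group (src, dst, t) events into snapshots of fixed width `window`.

    Hypothetical helper: snapshot k collects every event whose timestamp
    falls in the half-open interval [k * window, (k + 1) * window).
    Windows that contain no events are skipped rather than emitted empty.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    buckets = defaultdict(set)
    for src, dst, t in events:
        buckets[int(t // window)].add((src, dst))
    # Return one sorted edge list per non-empty window, in time order.
    return [sorted(buckets[k]) for k in sorted(buckets)]


events = [("a", "b", 0.5), ("b", "c", 1.2), ("a", "b", 1.9), ("c", "a", 4.0)]
snapshots = events_to_snapshots(events, window=2.0)
```

Note that the repeated interaction between `a` and `b` collapses into a single edge in the first snapshot, which is one way fine-grained timing is lost under discretization.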
The most general representation in the provided material is the Time-Varying Graph model $H = (V, E, T)$, with $E \subseteq V \times T \times V \times T$ (Wehmuth et al., 2014). This formulation classifies dynamic edges into spatial edges, temporal edges, mixed edges, and spatial-temporal self-loops, depending on whether the endpoints and the timestamps coincide. By defining the temporal-node set $V \times T$, each dynamic edge becomes an ordinary directed edge between temporal nodes, which yields an isomorphism to a static directed graph on $V \times T$ (Wehmuth et al., 2014). The same source states that the model can represent snapshot sequences, interval-labeled edges, spatial-plus-temporal edge models, and temporal-plus-mixed edge models, and can intrinsically model cyclic behavior through regressive edges (Wehmuth et al., 2014).
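The edge classification and the mapping to a static directed graph can be sketched as follows. The function names are hypothetical, and the classification rule is the natural reading of "endpoint and timestamp coincidence" (same node or not, same time or not); it is offered as an illustration rather than as the authors' code.

```python
def classify_dynamic_edge(u, ta, v, tb):
    """Classify a dynamic edge (u, ta, v, tb) by endpoint/timestamp coincidence."""
    same_node, same_time = (u == v), (ta == tb)
    if same_node and same_time:
        return "spatial-temporal self-loop"
    if same_time:
        return "spatial"   # distinct nodes, same time instant
    if same_node:
        return "temporal"  # same node, distinct time instants
    return "mixed"         # distinct nodes and distinct time instants


def to_static_digraph(dynamic_edges):
    """Map each dynamic edge to a directed edge between temporal nodes (node, time)."""
    adjacency = {}
    for u, ta, v, tb in dynamic_edges:
        adjacency.setdefault((u, ta), set()).add((v, tb))
    return adjacency


edges = [("a", 0, "b", 0), ("a", 0, "a", 1), ("a", 0, "b", 1), ("b", 1, "b", 1)]
kinds = [classify_dynamic_edge(*e) for e in edges]
static = to_static_digraph(edges)
```

Because every temporal node is just a `(node, time)` pair, any standard static-graph algorithm can be run on `static` without further adaptation.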
Problem formulations in TGM vary with the downstream task. Dynamic link prediction treats the input as a continuous-time temporal graph and asks the model to score whether a link between a given source and destination will occur at a given time (Huang et al., 2023). Dynamic node property prediction, described in TGB as node-affinity prediction, seeks an output vector over a target set for each source node and time (Huang et al., 2023). Temporal Knowledge Graph Forecasting uses timestamped quadruples (subject, relation, object, timestamp) and predicts a missing object for a future-time query (Chang et al., 2025). Spatio-temporal forecasting instead assumes a static graph with time-indexed node signals and masks (Bilal et al., 2025). Pattern matching over temporal graphs introduces temporal basic graph patterns, where a static pattern is paired with a temporal constraint, possibly specified by a timed automaton (Aghasadeghi et al., 2022).
This diversity of formulations suggests that TGM is not a single model family but a shared mathematical domain. A plausible implication is that unification efforts in TGM are driven as much by representation choice as by learning architecture.
2. Temporal structure: snapshots, events, continuity, and hierarchy
A core methodological divide in TGM separates snapshot-based, event-stream, continuous-depth, and hierarchical formulations. The survey identifies streaming/event-stream methods such as JODIE, DyRep, HTNE, TGN, and TGAT, and snapshot/discrete methods such as DynGEM, EvolveGCN, and DySAT (Gupta et al., 2022). EvolveGCN evolves layer weights with an RNN, while TGAT uses time encodings in neighborhood attention (Gupta et al., 2022).
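To make the idea of a time encoding concrete, the sketch below maps a time gap to a fixed-length vector of sinusoidal features. It is written in the spirit of the encodings used by attention-based temporal models, but the function name, the fixed `frequencies` list, and the normalization are assumptions; in practice the frequencies are usually learned parameters.

```python
import math


def time_encoding(delta_t, frequencies):
    """Encode a time gap as scaled cosine/sine features (illustrative only).

    `frequencies` is a hypothetical list of fixed angular frequencies;
    the 1/sqrt(d) scaling gives the encoding unit Euclidean norm.
    """
    if not frequencies:
        raise ValueError("at least one frequency is required")
    scale = math.sqrt(1.0 / len(frequencies))
    features = []
    for w in frequencies:
        features.append(scale * math.cos(w * delta_t))
        features.append(scale * math.sin(w * delta_t))
    return features


encoding = time_encoding(3.7, [1.0, 0.1, 0.01])
```

Such a vector can be concatenated with neighbor features before attention scores are computed, so that recent and distant interactions are distinguishable.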
Continuous Temporal Graph Networks (CTGNs) replace discrete updates with a Neural-ODE in node-representation space, assuming that each node embedding evolves continuously over the duration of a link according to an ordinary differential equation whose right-hand side depends on the node's neighborhood at the link-start time (Guo et al., 2022). In this formulation, link start times enter the GNN aggregator through a continuous time encoder, and the link duration becomes the upper limit of ODE integration (Guo et al., 2022). The same source states that many dynamic graph networks can be viewed as specific discretizations of CTGNs (Guo et al., 2022).
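The role of the link duration as an integration limit can be illustrated with a forward-Euler sketch. The dynamics used here, in which each embedding drifts toward the mean of its neighbors, is a stand-in chosen for simplicity and is an assumption; it is not the right-hand side used by CTGN, and the function name `euler_integrate` is hypothetical.

```python
def euler_integrate(h, neighbors, duration, steps=100):
    """Forward Euler for the toy dynamics dh_i/dt = mean_{j in N(i)} h_j - h_i.

    `h` maps node -> embedding (list of floats); `neighbors` maps node -> list
    of neighbor ids fixed at the link-start time; integration runs over
    [0, duration], so the link duration is the upper integration limit.
    """
    if steps <= 0:
        raise ValueError("steps must be positive")
    dt = duration / steps
    state = {i: list(vec) for i, vec in h.items()}
    for _ in range(steps):
        nxt = {}
        for i, vec in state.items():
            nbrs = neighbors.get(i, [])
            if nbrs:
                target = [sum(state[j][k] for j in nbrs) / len(nbrs)
                          for k in range(len(vec))]
            else:
                target = vec  # isolated node: zero derivative
            nxt[i] = [x + dt * (m - x) for x, m in zip(vec, target)]
        state = nxt
    return state


h0 = {"a": [0.0], "b": [2.0]}
nbrs = {"a": ["b"], "b": ["a"]}
h1 = euler_integrate(h0, nbrs, duration=1.0)
```

Setting `steps=1` recovers a single discrete update, which is one way to read the claim that discrete temporal GNNs correspond to particular discretizations of a continuous model.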
TimeGraphs proposes a different structural response to temporal heterogeneity. Rather than processing all timesteps uniformly, it converts a sequence of scene graphs into a hierarchical temporal knowledge graph consisting of a base level and successively coarser levels of events (Maheshwari et al., 2024). At each level, supernodes are formed via an assignment matrix, supernode features are aggregated from constituent lower-level nodes, and the coarsened adjacency is computed from the lower-level adjacency and the assignment matrix (Maheshwari et al., 2024). The paper explicitly characterizes this as converting a non-uniformly evolving sequence of scene or interaction graphs into a single, adaptive, multi-scale graph (Maheshwari et al., 2024).
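One level of such coarsening can be sketched with plain matrix products. The sketch assumes the common pooling convention in which the coarse adjacency is the assignment matrix transposed, times the adjacency, times the assignment matrix, and in which features are aggregated by summation; both choices are assumptions for illustration and are not confirmed by the source.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def coarsen(adjacency, features, assignment):
    """One coarsening step under the assumed convention A' = S^T A S, X' = S^T X.

    `assignment` has one row per lower-level node and one column per supernode.
    Diagonal entries of A' count intra-supernode edges (each undirected edge twice).
    """
    s_t = transpose(assignment)
    coarse_adj = matmul(matmul(s_t, adjacency), assignment)
    coarse_feat = matmul(s_t, features)
    return coarse_adj, coarse_feat


# Path graph 0-1-2-3; nodes {0, 1} form supernode 0 and {2, 3} form supernode 1.
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
X = [[1.0], [2.0], [3.0], [4.0]]
S = [[1, 0], [1, 0], [0, 1], [0, 1]]
coarse_adj, coarse_feat = coarsen(A, X, S)
```

Applying `coarsen` repeatedly with a new assignment matrix at each step would produce the successively coarser levels of a hierarchy.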
Other works introduce alternative temporal structure assumptions. TREND uses a Hawkes-process view of link formation and models “exciting effects” through a conditional intensity that combines a base rate with decayed contributions from prior events (Wen et al., 2022). TGNE embeds nodes as piece-wise linear trajectories of Gaussian distributions in latent space, so that continuous-time sparsity is handled through uncertainty estimates in the posterior variances (Romero et al., 2024). The source-separation approach to computer networks assumes that each observed adjacency matrix decomposes into a weighted combination of source components, also expressible in factorized form, where only the mixing coefficients vary with time (Larroche, 2023).
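The Hawkes-style conditional intensity described above for TREND, a base rate plus decayed contributions from earlier events, can be written down in a few lines. The exponential decay kernel and the parameter names `base_rate`, `excitation`, and `decay` are assumptions made for this sketch; the source does not specify the kernel.

```python
import math


def hawkes_intensity(t, history, base_rate, excitation, decay):
    """Conditional intensity at time t with an (assumed) exponential kernel.

    Each past event at time s < t contributes excitation * exp(-decay * (t - s)),
    so recent events raise the intensity more than old ones.
    """
    boost = sum(excitation * math.exp(-decay * (t - s)) for s in history if s < t)
    return base_rate + boost


past_events = [1.0, 1.5]
soon_after = hawkes_intensity(1.6, past_events, base_rate=0.2, excitation=0.5, decay=1.0)
much_later = hawkes_intensity(10.0, past_events, base_rate=0.2, excitation=0.5, decay=1.0)
```

With no history the function returns the base rate, and as time passes without new events the intensity relaxes back toward it.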
These formulations target different temporal phenomena: arbitrary event times and durations, self-excitation, non-uniform temporal salience, seasonal variation, or uncertainty under sparsity. This suggests that temporal structure in TGM is often chosen to mirror the dominant regularity in the application domain rather than to satisfy a single canonical formalism.
3. Learning architectures and inference mechanisms
Neural architectures in TGM typically combine temporal encoding, graph aggregation, and task-specific decoding, and the survey gives a generic node-update form of this kind for continuous-time methods (Gupta et al., 2022).
TimeGraphs constructs its hierarchy through vertex infomax pooling (VIPool) within a Graph Cross Network backbone. At each scale, VIPool selects a subset of vertices maximizing a mutual-information-based criterion, and a greedy approximation with negative sampling keeps the per-level cost tractable (Maheshwari et al., 2024). After hierarchy construction, a relational GCN updates node states across relation types and across hierarchy levels, with optional attention weights (Maheshwari et al., 2024).
TGN-style models instead maintain node memories. In the TGN formalism summarized in the TGNv2 paper, each event triggers message construction, message aggregation, and memory update, followed by embedding computation from memories and neighbor features (Tjandra et al., 2024). The TGNv2 contribution is to augment each message with source-target identification, using encoded node indices, so that the model can represent persistent forecasting, moving averages, and linear autoregressive functions of past messages (Tjandra et al., 2024). The paper states that no instantiation of TGN can represent moving averages of a given order, whereas TGNv2 can exactly represent persistent forecasting, moving averages of any order, and any linear autoregressive model of a given order, subject to a boundedness condition on the temporal graph (Tjandra et al., 2024).
TGSL addresses incomplete and noisy graph structure by learning additional temporal edges. It first computes edge embeddings with an edge-centric time-aware GNN, then uses an RNN over a node’s recent interacted neighbors to produce a time-aware context vector, samples a small candidate pool, warps embeddings to a sampled target time, scores candidate edges through inner products, and uses Gumbel-Top-K edge selection to add plausible edges (Zhang et al., 2023). The total loss combines task loss on the original graph, task loss on the augmented graph, and a contrastive regularizer (Zhang et al., 2023).
TGPM is described as a “pattern-centric” framework. It constructs interaction patches using temporally-biased random walks rooted at a target node, converts each walk into a token embedding through a small Transformer, mean-pools tokens into a patch embedding, and feeds sequences of patch embeddings into a time-aware Transformer with a learned time bias matrix (Ma et al., 2026). Pre-training uses masked token modeling and next-time prediction (Ma et al., 2026).
TG-ODE targets irregularly sampled and partially observed networked dynamical systems. Hidden states evolve between observations through a Graph Neural ODE, and at each observation time an imputed state, a reliability matrix, and a Graph-GRU update are combined to form the next hidden state (Zou et al., 2024). T-GMM, by contrast, is a discrete-time architecture for spatio-temporal forecasting that combines node-level processing, patch-level subgraph encoding, and a three-dimensional MLP-Mixer that mixes across patch tokens, time tokens, and feature channels (Bilal et al., 2025).
The architectural landscape therefore spans RNN-controlled GNNs, temporal attention, Neural-ODEs, Hawkes-process intensities, hierarchical pooling, Transformer encoders, contrastive graph structure learning, and MLP mixers. A plausible implication is that TGM research differentiates models less by the mere presence of temporal information than by where temporal inductive bias is inserted: in memory, in the message function, in the latent dynamics, in structure learning, or in the representation hierarchy.
4. Querying, prediction, and generation
TGM supports predictive, inferential, and generative tasks. For predictive tasks, dynamic link prediction and node-affinity prediction are standard. TGB formalizes dynamic link prediction as ranking the true destination among negative candidates and evaluates it with filtered MRR (Huang et al., 2023). Dynamic node property prediction is evaluated with NDCG@K (Huang et al., 2023). The TGB paper reports that on node tasks, persistence is best on UNTrade and moving average is best on LastFM, Subreddit, and tgbn-token, while TGN is second best on LastFM (Huang et al., 2023). The TGNv2 paper is explicitly motivated by this failure mode, stating that heuristic approaches such as persistent forecasts and moving averages over ground-truth labels significantly and consistently outperform TGNs on dynamic node affinity prediction (Tjandra et al., 2024).
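The two heuristics named above are simple enough to state in code. The sketch below keeps, for each node, the most recent label vectors and predicts their average; with `order=1` it reduces to a persistent forecast. The class name and its interface are hypothetical and do not reproduce the TGB or TGNv2 implementations.

```python
from collections import defaultdict, deque


class MovingAverageBaseline:
    """Predict a node's next label vector as the mean of its last `order` labels."""

    def __init__(self, order):
        if order < 1:
            raise ValueError("order must be at least 1")
        # deque(maxlen=order) silently discards labels older than the window.
        self.history = defaultdict(lambda: deque(maxlen=order))

    def update(self, node, label_vector):
        self.history[node].append(list(label_vector))

    def predict(self, node, dim):
        past = self.history.get(node)
        if not past:
            return [0.0] * dim  # unseen node: no information, predict zeros
        return [sum(vec[k] for vec in past) / len(past) for k in range(dim)]


moving_average = MovingAverageBaseline(order=2)
persistence = MovingAverageBaseline(order=1)
for labels in ([1.0, 0.0], [0.0, 1.0], [0.0, 0.5]):
    moving_average.update("u", labels)
    persistence.update("u", labels)
ma_prediction = moving_average.predict("u", dim=2)
last_prediction = persistence.predict("u", dim=2)
```

Both predictors depend on knowing which node each past label belongs to, which is the kind of identity information the TGNv2 discussion is concerned with.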
For temporal reasoning over scene graphs, TimeGraphs reports event prediction and recognition on Football, Resistance, and MOMA (Maheshwari et al., 2024). For temporal knowledge graphs, TGL-LLM integrates a temporal graph learning module into an LLM-based forecasting pipeline, using RGCN plus GRU to generate historical graph embeddings, a hybrid graph tokenization to inject graph information into prompts, and a two-stage fine-tuning paradigm for graph-language alignment (Chang et al., 2025).
Pattern querying is developed in the timed-automata framework for temporal graph patterns. A timed automaton is used to define temporal constraints over edge variables in a temporal basic graph pattern, and three evaluation schemes are provided: a baseline two-phase algorithm, an on-demand incremental algorithm, and a fully incremental partial-match algorithm (Aghasadeghi et al., 2022). The same source states that timed automata subsume standard existential temporal motifs while expressing non-existential alternation, contiguity, mandated response or timeout patterns, mutual exclusions, set-containment constraints, and arbitrary Boolean combinations of clock-gaps (Aghasadeghi et al., 2022).
Generative modeling constitutes another major branch of TGM. TIGGER treats a temporal interaction graph as a corpus of temporal random walks and learns an auto-regressive factorization over the steps of each walk, with a structural decoder for the next node and an intensity-free temporal point-process decoder for inter-arrival times via a mixture of log-normal components (Gupta et al., 2022). The source states that TIGGER has both transductive and inductive variants and avoids node identity leakage in the inductive setting (Gupta et al., 2022).
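Sampling inter-arrival times from a mixture of log-normal components, as in the temporal decoder described above, can be sketched with the standard library alone. The mixture weights and parameters below are made-up illustrative values, and the function names are hypothetical; in a trained model these quantities would be produced by the network at each step.

```python
import random


def sample_inter_arrival(weights, mus, sigmas, rng):
    """Draw one positive waiting time from a log-normal mixture."""
    component = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return rng.lognormvariate(mus[component], sigmas[component])


def sample_timestamps(start, count, weights, mus, sigmas, seed=0):
    """Accumulate sampled waiting times into an increasing timestamp sequence."""
    rng = random.Random(seed)
    times, t = [], start
    for _ in range(count):
        t += sample_inter_arrival(weights, mus, sigmas, rng)
        times.append(t)
    return times


timestamps = sample_timestamps(
    0.0, 5, weights=[0.7, 0.3], mus=[0.0, 2.0], sigmas=[0.5, 0.25]
)
```

Because log-normal samples are always positive, the generated timestamps are increasing by construction, and no explicit intensity function has to be integrated.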
MTM instead models temporal graph generation as a motif transition process. A motif transitions to a larger motif when a new event attaches to it, and transition probabilities and transition rates are estimated from the source graph (Liu et al., 2023). Synthetic generation proceeds through cold-event generation followed by hot-event simulation, where motif extensions are sampled according to transition probabilities and exponential waiting times parameterized by estimated rates (Liu et al., 2023). TTERGM offers a statistical generative model for social networks by extending TERGMs with explicit triangle counts, two-path closures, and three-path closures, together with a dyadic prior encoding social-learning effects (Huang et al., 2022).
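The sampling step of a motif transition process, choosing a successor motif by probability and a waiting time from an exponential distribution, can be sketched as a small simulation. The motif names, the probability and rate tables, and the function `simulate_motif_path` are all invented for illustration; in MTM these quantities are estimated from the source graph.

```python
import random


def simulate_motif_path(start, transition_probs, transition_rates, horizon, seed=0):
    """Simulate one motif's evolution until it is terminal or the horizon is hit.

    transition_probs[m] maps successor motif -> probability;
    transition_rates[m] is the rate of the exponential waiting time in motif m.
    """
    rng = random.Random(seed)
    t, state = 0.0, start
    path = [(t, state)]
    while state in transition_probs:
        t += rng.expovariate(transition_rates[state])
        if t > horizon:
            break  # next event would fall outside the simulated time span
        successors = list(transition_probs[state])
        weights = [transition_probs[state][s] for s in successors]
        state = rng.choices(successors, weights=weights, k=1)[0]
        path.append((t, state))
    return path


# Hypothetical tables: an edge grows into a wedge, then a triangle or a star.
probs = {"edge": {"wedge": 1.0}, "wedge": {"triangle": 0.4, "star": 0.6}}
rates = {"edge": 2.0, "wedge": 0.5}
path = simulate_motif_path("edge", probs, rates, horizon=100.0)
```

Motifs without an entry in the probability table are treated as terminal, which keeps the simulation finite.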
This range of tasks shows that TGM is not limited to embeddings for downstream classifiers. It also includes grammars of temporal constraints, explicit probabilistic graph evolution models, and generative mechanisms aimed at preserving structural and temporal statistics.
5. Evaluation protocols, datasets, and empirical findings
Evaluation in TGM is strongly shaped by benchmark design. TGB provides nine large, real-world temporal-graph datasets, chronologically split into train, validation, and test with no leakage, covering edge-level dynamic link prediction and node-level dynamic node property prediction (Huang et al., 2023). For the link task, the benchmark uses filtered MRR with fixed negative sets sampled partly from hard historical negatives and partly uniformly from the current node set; for the node task, it uses NDCG@K (Huang et al., 2023). TGB further emphasizes that model performance varies drastically across datasets and that simple methods often achieve superior performance compared to existing temporal graph models on dynamic node property prediction (Huang et al., 2023).
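The two metrics can be sketched as follows. The code assumes that each link query already comes with a filtered list of negative-candidate scores, breaks ties in favor of the true destination, and uses the usual logarithmic discount for NDCG; these conventions and all function names are assumptions for illustration and may differ in detail from the TGB evaluator.

```python
import math


def reciprocal_rank(pos_score, neg_scores):
    """1 / rank of the true destination among its (already filtered) negatives."""
    rank = 1 + sum(1 for s in neg_scores if s > pos_score)
    return 1.0 / rank


def mean_reciprocal_rank(queries):
    """`queries` is a list of (pos_score, neg_scores) pairs."""
    return sum(reciprocal_rank(p, n) for p, n in queries) / len(queries)


def ndcg_at_k(predicted, relevance, k):
    """NDCG@K of a predicted score vector against ground-truth relevance."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)[:k]
    dcg = sum(relevance[i] / math.log2(pos + 2) for pos, i in enumerate(order))
    ideal = sorted(relevance, reverse=True)[:k]
    idcg = sum(r / math.log2(pos + 2) for pos, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


mrr = mean_reciprocal_rank([(0.9, [0.1, 0.95, 0.3]), (0.8, [0.1])])
good = ndcg_at_k([0.9, 0.1, 0.5], relevance=[3, 0, 1], k=2)
poor = ndcg_at_k([0.1, 0.9, 0.5], relevance=[3, 0, 1], k=2)
```

In the first query the true destination is outscored by one negative and so receives reciprocal rank 1/2, while the second query is ranked first, giving a mean of 0.75.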
TimeGraphs evaluates on Football, Resistance, and MOMA. On Football and Resistance, TimeGraphs-E2E obtains 0.794/0.762/0.838/0.702 and 0.685/0.688/0.684/0.681, respectively, across four reported metrics that include precision, recall, and exact match (EM), and the paper reports relative EM gains on both datasets (Maheshwari et al., 2024). On MOMA, TimeGraphs reaches 95.3 activity mAP, 69.5 sub-activity mAP, 35.8 atomic classification mAP, and 44.1 localization mAP (Maheshwari et al., 2024). The same paper reports that in the Resistance task, early-round prediction accuracy rises as the available history grows from 10 sec to 60 sec, and that on Football, performance declines as the prediction horizon lengthens (Maheshwari et al., 2024).
CTGN reports results on event-based datasets with durations, including Netflix, Mooc, and Lastfm, as well as contact-sequence datasets Wikipedia and Reddit (Guo et al., 2022). The paper reports average precision (AP) comparisons between CTGN and TGN, for example on Netflix in the transductive setting and on Mooc in the inductive setting (Guo et al., 2022). TREND reports gains over static, snapshot-based, continuous-time, and Hawkes-based baselines, including F1 gains on Taobao and on citation data (Wen et al., 2022).
TGSL evaluates on Wikipedia, Reddit, and Escorts. On Wikipedia, TGAT* improves in both ACC and AP when combined with TGSL, in the transductive as well as the inductive setting (Zhang et al., 2023). The same source states that removing the supervised loss on the augmented graph lowers ACC, and that using learned edges at inference raises ACC (Zhang et al., 2023).
TGNE evaluates temporal network reconstruction with AUC on simulated SBM, HighSchool, MIT Reality Mining, Workplace, and UCI online communication. The reported test AUCs on HighSchool, Workplace, and UCI exceed the test AUC of LSDM on each of these datasets (Romero et al., 2024). The same source states that uncertainty estimates align with the time-varying degree distribution, and that uncertainty ablations show a larger prior scale yields higher overall uncertainty and a stronger correlation between event counts and uncertainty (Romero et al., 2024).
The generative papers use distinct metrics. TIGGER evaluates duplication and fidelity over graph statistics such as mean degree, wedge count, triangle count, power-law exponent, clustering coefficient, mean betweenness, and mean closeness, and reports speedups over prior temporal generators (Gupta et al., 2022). MTM measures preservation of global-structural metrics, global-temporal metrics, local motif metrics, and runtime, with reported CPU times on Email-EU and CollegeMsg that are much smaller than those of TASBM, STM, TagGen, and explicit motif counting (Liu et al., 2023). TTERGM reports mean absolute error reductions over the block model and classic TERGM on GitHub network data, across the four reported in-degree and out-degree prediction settings (Huang et al., 2022).
6. Recurring limitations, controversies, and directions of development
Several limitations recur across the literature. The survey notes that many approaches are transductive, weak in temporal modeling, or limited in scalability, and that node-ID leakage and one-to-one mappings prevent up- or down-scaling in temporal graph generation (Gupta et al., 2022). CTGN notes that ODE-solver overhead can exceed that of discrete layers for very tight tolerances or highly stiff dynamics, and that the architectures of the ODE dynamics function and of the time encoder must be chosen carefully for stability (Guo et al., 2022). TGNE is explicitly transductive with respect to nodes, uses fixed changepoints, and suffers from rotational identifiability if the prior is weak (Romero et al., 2024). TG-ODE is motivated by the difficulty of capturing spatial and temporal dependencies in irregularly sampled and partially observable graph time series using standard Neural ODE or RNN-based approaches (Zou et al., 2024). T-GMM reports that memory use becomes prohibitive on larger graphs such as PV-US due to overlapping patches (Bilal et al., 2025).
A specific controversy in TGM concerns the strength of simple heuristics. TGB reports that persistence and moving average baselines outperform existing temporal graph models on several dynamic node property tasks (Huang et al., 2023). TGNv2 formalizes this issue by proving that no instantiation of TGN can represent moving averages of a given order, and by showing empirically that moving averages over messages outperform TGN and current temporal graph models on dynamic node affinity prediction (Tjandra et al., 2024). This is not merely an optimization issue; in the formulation of TGNv2, it is a representational limitation caused by permutation invariance with respect to sender and receiver identities (Tjandra et al., 2024).
Another recurring theme is unification. The Time-Varying Graph model is described as a unifying representation for finite discrete dynamic networks (Wehmuth et al., 2014). CTGN presents continuous-depth modeling as a generalization in which discrete temporal GNNs emerge as Euler or Runge–Kutta discretizations (Guo et al., 2022). The survey frames TGM itself as encompassing both representation learning and generative modeling (Gupta et al., 2022). More recently, the TGM library is presented as the first framework to unify CTDG and DTDG methods, with event-stream storage underlying both paradigms and time-granularity conversion implemented through a fully vectorized discretization operator (Chmura et al., 2025). That library reports average speedups over DyGLib both in end-to-end training and in graph discretization (Chmura et al., 2025).
Several papers also indicate future directions. CTGN lists multi-relation or knowledge-graph forecasting, adaptive solvers, and combining CTGN with hierarchical event batching (Guo et al., 2022). TGSL points toward learned temporal edge augmentation integrated with temporal graph encoders (Zhang et al., 2023). TGPM emphasizes cross-domain transfer and self-supervised pre-training over interaction patches (Ma et al., 2026). TGL-LLM suggests deeper integration of temporal graph encoders with LLMs through graph tokenization and modality alignment (Chang et al., 2025). The survey highlights heterogeneous temporal graphs, causal inference, long-range dependencies, deep motif-aware modeling, robust inductive generation, and unified benchmarks as open problems (Gupta et al., 2022).
Taken together, these works indicate that TGM has evolved from time-stamped extensions of static graph learning into a field concerned with temporal expressivity, continuous versus discrete semantics, hierarchy, uncertainty, efficient evaluation, and cross-task unification. A plausible implication is that future progress in TGM will depend less on isolated architectural novelty than on reconciling these dimensions within representations and evaluation protocols that remain faithful to real temporal graph processes.