Temporal Graph Sequential Recommender (TGSRec)
- TGSRec is a model that integrates temporal graphs, attention mechanisms, and collaborative signals to predict future user-item interactions.
- It employs continuous-time bipartite graph construction and learnable temporal embeddings to efficiently capture evolving user preferences and item dynamics.
- The transformer-based TCT layer with multi-head attention yields significant gains in recall and MRR over previous recommendation methods.
The Temporal Graph Sequential Recommender (TGSRec) is a model for sequential recommendation that combines temporal graph structures, attention-based deep learning, and explicit modeling of temporal collaborative and sequential patterns. TGSRec was introduced to address the challenge of jointly capturing evolving user preferences, complex item dynamics, and collaborative signals within temporally structured user-item interaction data. The TGSRec framework includes the construction of continuous-time bipartite graphs, dedicated time and collaborative-aware embeddings, and a specialized transformer architecture, yielding state-of-the-art performance for recommending items at arbitrary future timestamps (Fan et al., 2021).
1. Problem Setting and Graph Construction
TGSRec formulates the sequential recommendation problem as learning to predict future user-item interactions, where user behavior evolves over continuous time. The interaction data are modeled as a continuous-time bipartite graph (CTBG) $\mathcal{G} = (\mathcal{U}, \mathcal{I}, \mathcal{E})$ with user set $\mathcal{U}$, item set $\mathcal{I}$, and edge set $\mathcal{E}$, where each edge $(u, i, t) \in \mathcal{E}$ is an interaction signifying that user $u$ interacted with item $i$ at timestamp $t$. For each user $u$, the model must, at any query time $t$, rank all candidate items $i \in \mathcal{I} \setminus \mathcal{I}_u(t)$ (the set of items not yet interacted with by $u$ up to $t$), so that the item actually selected at $t$ is ranked highly. This continuous-time, inductive setting requires models both to memorize long-term evolving interests and to generalize to unseen query times.
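As a concrete illustration, the CTBG can be stored as per-user and per-item histories sorted by timestamp, so that collecting a user's interactions strictly before a query time reduces to a binary search. The class and method names below are illustrative, not from the paper:

```python
import bisect
from collections import defaultdict

class CTBG:
    """Toy continuous-time bipartite graph: edges are (user, item, timestamp)."""
    def __init__(self, edges):
        self.user_hist = defaultdict(list)   # user -> [(t, item), ...] sorted by t
        self.item_hist = defaultdict(list)   # item -> [(t, user), ...] sorted by t
        for u, i, t in sorted(edges, key=lambda e: e[2]):
            self.user_hist[u].append((t, i))
            self.item_hist[i].append((t, u))

    def neighbors_before(self, user, t, k):
        """Up to k most recent (timestamp, item) interactions of `user`
        strictly before query time t."""
        hist = self.user_hist[user]
        idx = bisect.bisect_left(hist, (t, ""))  # first entry with time >= t
        return hist[max(0, idx - k):idx]

g = CTBG([("u1", "iA", 1.0), ("u1", "iB", 3.0), ("u2", "iA", 2.0)])
print(g.neighbors_before("u1", 2.5, k=2))  # -> [(1.0, 'iA')]
```

Keeping histories time-sorted makes the temporal neighbor sampling used by the attention layers an O(log n) lookup per query.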
2. Temporal Embeddings and Representation Mechanisms
In TGSRec, each user $u$ and item $i$ is associated with a learnable long-term embedding vector $e_u, e_i \in \mathbb{R}^d$, stored in a joint embedding table $\mathbf{E} \in \mathbb{R}^{(|\mathcal{U}| + |\mathcal{I}|) \times d}$. Time is encoded by mapping any real-valued timestamp $t$ to a vector $\Phi(t) \in \mathbb{R}^{d_T}$ using a learnable multi-frequency trigonometric kernel:

$$\Phi(t) = \sqrt{\tfrac{1}{d_T}}\,\big[\cos(\omega_1 t), \sin(\omega_1 t), \ldots, \cos(\omega_{d_T/2}\, t), \sin(\omega_{d_T/2}\, t)\big],$$

where the frequencies $\omega_1, \ldots, \omega_{d_T/2}$ are learned. By Bochner's theorem, this encoding induces a translation-invariant kernel $\mathcal{K}(t_1, t_2) = \Phi(t_1)^\top \Phi(t_2)$ that quantifies temporal similarity.
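A minimal sketch of this encoding, assuming the Bochner-based cosine/sine form used by TGAT-style models; the frequencies are fixed here for illustration, whereas TGSRec learns them:

```python
import numpy as np

def time_encoding(t, freqs):
    """Map a scalar timestamp t to sqrt(1/d) * [cos(w*t)..., sin(w*t)...]."""
    d = 2 * len(freqs)
    angles = np.asarray(freqs) * t
    return np.sqrt(1.0 / d) * np.concatenate([np.cos(angles), np.sin(angles)])

# Illustrative fixed, log-spaced frequencies (8-dimensional encoding).
freqs = 1.0 / 10.0 ** np.linspace(0, 3, 4)

# The dot product phi(t1).phi(t2) depends only on t2 - t1, i.e. the induced
# kernel is translation-invariant, as guaranteed by Bochner's theorem:
k1 = time_encoding(5.0, freqs) @ time_encoding(8.0, freqs)
k2 = time_encoding(105.0, freqs) @ time_encoding(108.0, freqs)
assert np.isclose(k1, k2)
```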
3. Temporal Collaborative Transformer (TCT) Architecture
The TGSRec core architecture is the Temporal Collaborative Transformer (TCT) layer, which generalizes self-attention to jointly encode:
- Sequential patterns in user-item trajectories;
- Temporal proximity via explicit time-kernelization;
- Collaborative signals via query–key composition of user and item embeddings.
Each TCT layer receives, for every node $v$ (user or item) at time $t$, a temporal input formed by concatenating its embedding with its time encoding, $h_v(t) = [\,e_v \,\|\, \Phi(t)\,]$. For a query user $u$ at time $t$, a fixed number of their past interactions $\{(i_k, t_k)\}$ are sampled as neighbors. Attention is computed via

$$\alpha_k = \mathrm{softmax}_k\!\left(\frac{\big(W_Q\, h_u(t)\big)^\top \big(W_K\, h_{i_k}(t_k)\big)}{\sqrt{d}}\right),$$

where $W_Q$ projects the query $h_u(t)$, $W_K$ projects the stack of neighbor keys $h_{i_k}(t_k)$, and $W_V$ projects the corresponding stack of values. Because each input concatenates a node embedding with a time encoding, the dot product captures both collaborative affinity (embedding–embedding terms) and temporal proximity (time–time kernel terms). The aggregated message $m_u(t) = \sum_k \alpha_k\, W_V\, h_{i_k}(t_k)$ is fused with the query into an updated embedding via a two-layer feed-forward network. Multi-head attention and stacked layers support higher-order temporal and collaborative dependencies.
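One attention step of this kind can be sketched as a single-head computation with random toy parameters; all dimensions, weights, and variable names below are illustrative rather than taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
d_e, d_t = 8, 4          # embedding dim and time-encoding dim (toy sizes)
d_in = d_e + d_t         # each input is [embedding || time encoding]
d_k = 16                 # projection dim

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

# Query/key/value projections (randomly initialized for illustration).
W_Q, W_K, W_V = (rng.standard_normal((d_k, d_in)) * 0.1 for _ in range(3))

h_user = rng.standard_normal(d_in)            # [e_u || phi(t)] at query time t
h_neighbors = rng.standard_normal((5, d_in))  # 5 sampled past interactions

q = W_Q @ h_user
K = h_neighbors @ W_K.T
V = h_neighbors @ W_V.T
alpha = softmax(K @ q / np.sqrt(d_k))  # attention over the sampled history
message = alpha @ V                    # aggregated temporal-collaborative message
assert np.isclose(alpha.sum(), 1.0) and message.shape == (d_k,)
```

Because the query and keys both carry a time-encoding slice, the same dot product scores both embedding similarity and temporal proximity, which is the core of the TCT design.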
4. Prediction Layer, Training Objectives, and Optimization
Recommendation at time $t$ involves ranking candidate items by the score $r(u, i, t) = e_u^{(L)}(t)^\top e_i^{(L)}(t)$, the inner product of the final-layer temporal user and item embeddings. Losses are computed using either the Bayesian Personalized Ranking (BPR) loss,

$$\mathcal{L}_{\mathrm{BPR}} = -\sum_{(u, i, t)} \log \sigma\!\big(r(u, i, t) - r(u, j, t)\big),$$

or a binary cross-entropy loss. For each positive interaction $(u, i, t)$, a negative item $j$ (not yet seen by $u$ by time $t$) is sampled. Model parameters are updated via mini-batch Adam optimization, with on-the-fly negative sampling and standard regularization.
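A minimal sketch of the BPR objective on a single (positive, negative) pair, assuming precomputed final-layer embeddings; the function name and values are illustrative:

```python
import numpy as np

def bpr_loss(e_user, e_pos, e_neg):
    """-log sigmoid(r(u,i,t) - r(u,j,t)) with inner-product scores."""
    r_pos = e_user @ e_pos          # score of the observed item
    r_neg = e_user @ e_neg          # score of the sampled negative item
    return -np.log(1.0 / (1.0 + np.exp(-(r_pos - r_neg))))

e_u = np.array([0.5, 1.0, -0.2])
loss = bpr_loss(e_u,
                e_pos=np.array([0.4, 0.9, 0.0]),
                e_neg=np.array([-0.3, 0.1, 0.8]))
# The loss is positive and shrinks as the positive score exceeds the negative.
assert loss > 0
```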
5. Experimental Protocol and Results
Empirical validation of TGSRec (Fan et al., 2021) was conducted on five datasets: four Amazon categories (“Toys”, “Baby”, “Tools”, “Music”) and MovieLens-100K. All datasets are split chronologically (80%/10%/10%) and are highly sparse (e.g., “Toys” has low interaction density, with a mean inter-event interval of 85 days). Baselines included static collaborative filtering (BPR, LightGCN), temporal graph models (CTDNE), RNN-based and graph-based sequential models (FPMC, GRU4Rec, Caser, SR-GNN), and Transformer-based methods (SASRec, BERT4Rec, SSE-PT, TiSASRec). Evaluation metrics comprised Recall@10, Recall@20, MRR, and NDCG@10, ranking each target item among 1,000 randomly sampled negatives per test query using Krichene & Rendle's unbiased estimator.
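The sampled-negative protocol can be illustrated as follows; the scores here are synthetic, and the simple rank computation (counting negatives that tie or beat the target) is a plain stand-in for, not an implementation of, the unbiased estimator used in the paper:

```python
import numpy as np

def rank_of_target(target_score, negative_scores):
    """1-based rank of the target item among itself plus sampled negatives;
    ties are counted against the target."""
    return 1 + int(np.sum(negative_scores >= target_score))

rng = np.random.default_rng(42)
neg_scores = rng.standard_normal(1000)        # scores of 1,000 sampled negatives
rank = rank_of_target(target_score=6.0, negative_scores=neg_scores)

recall_at_10 = 1.0 if rank <= 10 else 0.0     # per-query Recall@10 contribution
mrr_contrib = 1.0 / rank                      # per-query MRR contribution
```

Averaging these per-query contributions over all test interactions yields the reported Recall@K and MRR figures.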
Across all datasets, TGSRec yielded large absolute improvements over strong baselines. Averaged over the five datasets, TGSRec achieved a 22.5% absolute gain in Recall@10 and a 22.1% absolute gain in MRR compared to the best previous model. For example, on “Toys”, TGSRec achieved Recall@10 = 0.3650 and MRR = 0.3661, exceeding SASRec's Recall@10 = 0.1452 and MRR = 0.0732. Ablation studies confirmed that both the learned time kernel and collaborative attention are critical; stacking two TCT layers provides further improvements over a single layer.
6. Key Insights and Significance
TGSRec demonstrates that fusing sequential modeling, collaborative filtering, and dedicated continuous-time mechanisms yields substantial gains in sequential recommendation accuracy. The model's structure allows:
- Simultaneous encoding of dynamic user preferences and collaborative signals;
- Robust performance in sparse, irregular, and highly dynamic environments;
- Generalization to arbitrary query timestamps, not limited to observed discrete intervals.
Ablation results highlight the necessity of (a) learned temporal embeddings, (b) collaborative attention in the TCT, and (c) stacking attention layers for hierarchical aggregation. Removing any component significantly degrades performance. These findings confirm that merely applying Transformer-style architectures to timestamped graphs is insufficient without explicit temporal and collaborative integration.
7. Relationship to Broader Research and Extensions
TGSRec’s principle of unifying temporal, sequential, and collaborative modeling in inductive, graph-based architectures is consistent with broader trends in sequential recommendation and dynamic graph learning. Contemporary methods such as Time-Guided Graph Neural ODEs (TGODE) introduce adaptive time-aware augmentation and joint ODE-driven graph evolution to better handle irregular sampling and long-term preference drift in real data (Fu et al., 2025). TGODE, while related, relies on a diffusion-based graph augmenter and a continuous graph ODE, whereas TGSRec’s innovation centers on attention-based encoding within temporal graphs.
TGSRec constitutes a general, flexible, and scalable solution for sequential recommendation in temporally rich domains, providing a blueprint for subsequent models and analyses targeting fine-grained, dynamically evolving user-item interaction systems.