
RoTE: Coarse-to-Fine Multi-Level Rotary Time Embedding for Sequential Recommendation

Published 15 Apr 2026 in cs.IR | (2604.13389v1)

Abstract: Sequential recommendation models have been widely adopted for modeling user behavior. Existing approaches typically construct user interaction sequences by sorting items according to timestamps and then model user preferences from historical behaviors. While effective, such a process only considers the order of temporal information but overlooks the actual time spans between interactions, resulting in a coarse representation of users' temporal dynamics and limiting the model's ability to capture long-term and short-term interest evolution. To address this limitation, we propose RoTE, a novel multi-level temporal embedding module that explicitly models time span information in sequential recommendation. RoTE decomposes each interaction timestamp into multiple temporal granularities, ranging from coarse to fine, and incorporates the resulting temporal representations into item embeddings. This design enables models to capture heterogeneous temporal patterns and better perceive temporal distances among user interactions during sequence modeling. RoTE is a lightweight, plug-and-play module that can be seamlessly integrated into existing Transformer-based sequential recommendation models without modifying their backbone architectures. We apply RoTE to several representative models and conduct extensive experiments on three public benchmarks. Experimental results demonstrate that RoTE consistently enhances the corresponding backbone models, achieving up to a 20.11% improvement in NDCG@5, which confirms the effectiveness and generality of the proposed approach. Our code is available at https://github.com/XiaoLongtaoo/RoTE.

Summary

  • The paper introduces RoTE, which decomposes timestamps into year, month, and day components and incorporates them using a rotary embedding approach for enhanced temporal modeling.
  • The method achieves significant performance gains, such as a +20.11% improvement in NDCG@5 and +17.51% in Recall@5, while adding minimal computational overhead.
  • RoTE's plug-and-play design integrates directly with Transformer-based models, offering robust improvements in capturing long- and short-term user preferences.

Coarse-to-Fine Multi-Level Rotary Time Embedding for Sequential Recommendation

Introduction

The paper "RoTE: Coarse-to-Fine Multi-Level Rotary Time Embedding for Sequential Recommendation" (2604.13389) addresses fundamental limitations in temporal modeling within sequential recommender system frameworks, particularly those based on Transformer architectures. Standard sequential recommendation approaches rely heavily on order-preserving positional embeddings, which neglect the non-uniform and heterogeneous temporal intervals between user interactions. This reduces temporal dynamics to a coarse, order-only signal and inhibits the model's capacity to capture nuanced user preference evolution across multi-scale temporal gaps.

The RoTE module introduces a hierarchical approach to timestamp encoding, decomposing interaction time into year, month, and day components, and leveraging these granularities in a rotary embedding framework. This method enables attention mechanisms to perceive and exploit temporal distances and multi-level temporal patterns without requiring architectural changes to backbone models.

Methodology

Problem Definition

Sequential recommendation is formalized as $p(i_{L_u} \mid \mathcal{S}_u)$, where $\mathcal{S}_u$ denotes a user's chronologically ordered interaction history with items and corresponding timestamps. Accurate modeling requires capturing dependencies and evolution within such sequences, taking into account irregular inter-event intervals.

Temporal Feature Construction

RoTE decomposes Unix timestamps for each interaction into three ordinal calendar components: years ($y_k$), months ($m_k$), and days ($d_k$) since the epoch. This structured triplet preserves chronological order, aligns more closely with natural human temporal reasoning, and creates a foundation for multi-scale temporal modeling.
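The decomposition described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "since the epoch" means cumulative counts from 1970-01-01 UTC (so months keep increasing across years), and the function name `decompose_timestamp` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timezone

# Assumption: all components are counted from the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def decompose_timestamp(ts: int) -> tuple[int, int, int]:
    """Split a Unix timestamp (seconds) into (years, months, days) since the epoch."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    years = dt.year - EPOCH.year                      # coarse level y_k
    months = years * 12 + (dt.month - EPOCH.month)    # medium level m_k
    days = (dt - EPOCH).days                          # fine level d_k
    return years, months, days

# Hypothetical interaction times: 2014-01-01, 2014-02-01, 2014-02-02 (UTC).
timestamps = [1388534400, 1391212800, 1391299200]
triplets = [decompose_timestamp(t) for t in timestamps]
print(triplets)  # [(44, 528, 16071), (44, 529, 16102), (44, 529, 16103)]
```

Because each component is a cumulative count, all three are non-decreasing along a chronologically sorted sequence. The one-month and one-day gaps in the sample show up at different levels of the triplet.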

Rotary Time Embedding (RoTE)

RoTE injects temporal signals into the query and key vectors of the Transformer's multi-head self-attention mechanism by applying rotary transformations at each temporal granularity. Specifically:

  • For each temporal level $l \in \{y, m, d\}$, rotation angles $\boldsymbol{\theta}_k^{(l)}$ are computed using a fixed inverse frequency spectrum controlled by level-specific base scalars.
  • Query and key vectors undergo 2D rotational transformations over each even-odd dimension pair, encoding time-awareness directly in the angular relationships used for computing attention scores.
  • Three temporal representations per interaction are obtained (year, month, day) and combined via a weighted fusion with weights $\alpha_y$, $\alpha_m$, $\alpha_d$, which balance long-term preferences (coarse levels) against short-term dynamics (fine levels).

RoTE does not modify the value representations, the core attention formulation, or training objectives; it is entirely plug-and-play with negligible computational overhead.
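A minimal sketch of the rotation and fusion steps above, in plain Python. Several details are assumptions rather than the paper's specification: the angle formula follows the standard rotary-embedding convention ($t \cdot \text{base}^{-2i/d}$ for the $i$-th pair), the fusion is read as a weighted sum of the three rotated vectors, and the names `rotate`, `rote`, `BASES`, and `ALPHAS`, along with all numeric values, are hypothetical placeholders.

```python
import math

def rotate(vec, t, base):
    """Rotate each (even, odd) pair of vec by the angle t * base**(-i/d)."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        angle = t * base ** (-i / d)  # fixed inverse-frequency spectrum
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])  # 2D rotation of the pair
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Placeholder level-specific bases and fusion weights (not the paper's values).
BASES = {"y": 100.0, "m": 1000.0, "d": 10000.0}
ALPHAS = {"y": 0.2, "m": 0.3, "d": 0.5}

def rote(vec, times, bases=BASES, alphas=ALPHAS):
    """Weighted sum of the year-, month-, and day-rotated copies of vec."""
    fused = [0.0] * len(vec)
    for level, t in times.items():
        rotated = rotate(vec, t, bases[level])
        fused = [f + alphas[level] * r for f, r in zip(fused, rotated)]
    return fused

# Toy query/key vectors with (year, month, day) counts for two interactions.
q = [0.5, -1.0, 0.25, 2.0]
k = [1.0, 0.5, -0.75, 0.1]
t_q = {"y": 44, "m": 529, "d": 16103}
t_k = {"y": 44, "m": 528, "d": 16071}

# Only queries and keys are rotated; values are left untouched.
score = dot(rote(q, t_q), rote(k, t_k)) / math.sqrt(len(q))
```

Within a single level, the dot product of a rotated query and a rotated key depends only on the difference of their time indices, which is how the attention score comes to reflect temporal distance rather than sequence position. In this sketch, `dot(rotate(q, 16103, b), rotate(k, 16071, b))` equals `dot(rotate(q, 32, b), rotate(k, 0, b))` up to floating-point error.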

Experimental Evaluation

Datasets and Baselines

Experiments are conducted on three Amazon Reviews datasets (Sports and Outdoors, Beauty, and Toys and Games), using 5-core splits and fixed sequence length preprocessing. Baselines span both traditional (GRU4Rec, Caser, SASRec, BERT4Rec) and generative (VQRec, TIGER, HSTU, RPG) sequential recommendation paradigms. RoTE modules are integrated into SASRec (traditional) and RPG (generative) as representative backbones.

Numerical Results

RoTE yields consistent enhancements across all metrics and models. Notably:

  • On Toys and Games, RoTE improves RPG's NDCG@5 by +20.11%, and Recall@5 by +17.51%, demonstrating strong performance advantages.
  • Statistically significant improvements are observed via paired $t$-tests.
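For readers unfamiliar with the reported metrics, the sketch below shows how Recall@K, NDCG@K, and a relative improvement percentage are typically computed. It assumes a leave-one-out protocol with a single held-out target item per user, which is common in this literature but is not stated in this summary. The ranks and scores are made-up numbers, not results from the paper.

```python
import math

def recall_at_k(rank: int, k: int) -> float:
    """1 if the single target item appears in the top-k list, else 0."""
    return 1.0 if rank <= k else 0.0

def ndcg_at_k(rank: int, k: int) -> float:
    """With one relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 1)."""
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0

def relative_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old

# Hypothetical 1-based ranks of the target item for four users.
ranks = [1, 3, 6, 2]
recall5 = sum(recall_at_k(r, 5) for r in ranks) / len(ranks)
ndcg5 = sum(ndcg_at_k(r, 5) for r in ranks) / len(ranks)
print(recall5)  # 0.75
```

A figure such as "+20.11%" is a relative gain in this sense: the improvement divided by the backbone's own score, not an absolute difference in NDCG.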

Ablation studies show incremental gains when introducing structured calendar components (year, month, day) compared to pure timestamp rotary encoding or positional-only baselines. The full combination (Y+M+D), which includes the finest granularity, delivers the largest benefit.

Efficiency Analysis

RoTE introduces only marginal increases in FLOPs (e.g., SASRec: +110K) and inference latency (SASRec: +0.7ms; RPG: +1.9ms), with a slight reduction in parameter count due to the elimination of the positional embedding table. These results validate the practical applicability of RoTE for real-time recommender scenarios.
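The parameter reduction can be made concrete with a small calculation. A learned positional embedding table stores one vector per position, so removing it frees `max_len * hidden_dim` parameters. The sizes below are hypothetical, chosen only for illustration, and are not taken from the paper's configuration.

```python
def positional_table_params(max_len: int, hidden_dim: int) -> int:
    """Parameter count of a learned positional embedding table."""
    return max_len * hidden_dim

# Hypothetical SASRec-style configuration (not the paper's settings).
removed = positional_table_params(max_len=50, hidden_dim=64)
print(removed)  # 3200
```

Since the rotary angles come from a fixed inverse frequency spectrum, they add no learned table of comparable size, which is consistent with the slight net reduction reported above.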

Implications and Future Directions

Theoretically, RoTE suggests that multi-level temporal encoding strengthens the inductive bias within the attention mechanism for temporal reasoning, allowing for more accurate modeling of preference drift and multi-scale dynamics. Practically, RoTE's plug-and-play design offers immediate integration potential for a wide range of sequential models, facilitating performance upgrades with minimal system cost.

Future research may investigate:

  • Expanding the granularity spectrum (e.g., week, hour, minute) and dynamically learning optimal temporal fusion weights.
  • Extending RoTE to session-based, group, or multi-modal recommendation contexts.
  • Incorporating more advanced temporal regularization strategies or higher-order temporal interactions within rotary embeddings.

Conclusion

RoTE presents a practical, principled solution to limitations in temporal modeling for sequential recommendation. By decomposing timestamps into hierarchical components and injecting them via multi-level rotary embeddings in Transformers, RoTE enables superior temporal sensitivity and achieves robust performance gains across diverse baselines and datasets. Its lightweight and generic design positions it as a compelling temporal modeling module for future sequential recommendation research and production systems.
