
Long Behavior Sequential Recommendation

Updated 10 March 2026
  • Long behavior sequential recommendation is a framework that models extensive user interactions to capture long-term preferences and dynamic intent.
  • It employs advanced architectures like hybrid attention, state space models, and memory-augmented networks to address computational challenges and noise in large-scale data.
  • These techniques enable robust inference by efficiently handling long-range dependencies, multi-intent disentanglement, and heterogeneous behavioral signals.

Long Behavior Sequential Recommendation refers to the modeling, learning, and inference techniques for sequential recommender systems that explicitly capture dependencies, dynamics, and preference signals over extended user interaction histories—often spanning hundreds to tens of thousands of events, potentially in multi-behavior and multi-intent contexts. This field addresses critical challenges in modeling stable and drifting user interests, computational efficiency, memory bottlenecks, noise accumulation, and the disentanglement of diverse behavioral patterns within voluminous long-term data.

1. Fundamental Challenges and Problem Formulation

Long behavior sequential recommendation is defined by the need to predict a user's next or future interactions given an extensive history:

  • User histories: sequences $B = \{b_1, b_2, \ldots, b_T\}$, with $T$ potentially in the range $10^2$ to $10^4$ or more.
  • Prediction target: estimate $P(b_{T+1} \mid B)$ (or $P(\vec{b}_{T+1:T+k} \mid B)$ for multi-step) using the entire history without truncation or severe information loss (Xin et al., 20 Feb 2026, Huang et al., 26 Jan 2026).

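Key challenges include modeling both stable and drifting interests, computational efficiency, memory bottlenecks, noise accumulation, and the disentanglement of diverse behavioral patterns.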

2. Architectures for Long-Range Sequential Recommendation

2.1 Linear and Hybrid Attention Mechanisms

  • Hybrid linear-softmax attention: HyTRec explicitly decouples long-term and short-term modeling. The bulk of the user history is assigned to a parallel linear attention branch (near $O(T)$), and only the most recent $K \ll T$ actions are processed by classical softmax attention for high-resolution immediate intent (Xin et al., 20 Feb 2026).
  • Temporal-Aware Delta Network (TADN): Augments linear attention with time-sensitive gates that upweight recent actions (exponentially decayed), mitigating the tendency of linear mechanisms to lag in adapting to intent drift. The recurrent linear branch evolves its state as

$$S_t = S_{t-1}\,(\mathbf{I} - g_t \beta_t k_t k_t^\top) + \beta_t v_t k_t^\top$$

where $g_t$ is a temporally- and content-aware gate (Xin et al., 20 Feb 2026).

  • Rotary-Enhanced Linear Attention (RELA)/GRELA: Uses rotary position encodings within linear attention to achieve strong long-range modeling capacity, supplemented by SiLU-based gating to adaptively fuse global and local preference cues (Hu et al., 16 Jun 2025).
  • Multi-scale/Low-rank Transformers: MBHT deploys low-rank self-attention for efficiency and a multi-scale structure (fine/coarse sub-sequence granularity) to encode behaviors at different temporal resolutions, supporting hundreds of steps per user (Yang et al., 2022).
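
The TADN state recurrence quoted above can be illustrated with a minimal pure-Python sketch. It assumes scalar $g_t$ and $\beta_t$, a state $S$ of shape $d_v \times d_k$, and a read-out $o = S q$; the function names `gated_delta_step` and `read_out` are hypothetical and are not taken from the paper, which may parameterize the gate and the read-out differently.

```python
def gated_delta_step(S, k, v, g, beta):
    """One update S_t = S_{t-1} (I - g*beta*k k^T) + beta * v k^T.

    S is a d_v x d_k matrix stored as a list of rows; k has length d_k
    and v has length d_v. g and beta are scalars (an assumption here).
    """
    # S (I - c k k^T) = S - c (S k) k^T, so the identity is never materialized.
    Sk = [sum(s * kj for s, kj in zip(row, k)) for row in S]
    c = g * beta
    return [
        [S[i][j] - c * Sk[i] * k[j] + beta * v[i] * k[j] for j in range(len(k))]
        for i in range(len(v))
    ]


def read_out(S, q):
    """Query the state with q: o = S q."""
    return [sum(s * qj for s, qj in zip(row, q)) for row in S]


# Write a value under a unit-norm key, then write a second value under the same key.
key = [1.0, 0.0]
S0 = [[0.0, 0.0], [0.0, 0.0]]
S1 = gated_delta_step(S0, key, [2.0, 3.0], g=1.0, beta=1.0)
S_overwrite = gated_delta_step(S1, key, [5.0, 7.0], g=1.0, beta=1.0)
S_accumulate = gated_delta_step(S1, key, [5.0, 7.0], g=0.0, beta=1.0)
```

With a unit-norm key and $g_t = \beta_t = 1$, the old association is erased before the new one is written, so the read-out under that key returns the newer value. With $g_t = 0$ nothing is erased and the two values add up. A gate between these extremes controls how quickly stale associations fade, which is the behavior the text attributes to the time-sensitive gate.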

2.2 State Space Models and Parallelizable Recurrences

  • Structured State Space Duality (SSD4Rec): Leverages bidirectional block-wise state space models (Mamba derivatives), enabling hardware-parallel, linear-time sequence modeling with per-token adaptive dynamics (Qu et al., 2024).
  • Behavior-Dependent Linear Recurrent Units (RecBLR): Implements per-timestep, behavior-conditioned gates modulating memory contribution ($\alpha_t$) and input injection ($\beta_t$), admitting a parallel hardware scan via a custom associative operator for $O(\log T)$-depth forward/backward computation (Liu et al., 2024).
  • HoloMambaRec: Fuses holographic embeddings for compact attribute-item representations with shallow selective SSM blocks for constant-time per-timestep inference and linear overall complexity (Parthasarathy et al., 13 Jan 2026).
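
The idea of evaluating a gated linear recurrence with an associative scan can be sketched as follows. This is a generic illustration, assuming a scalar recurrence $h_t = \alpha_t h_{t-1} + \beta_t x_t$ with $h_0 = 0$; it is not RecBLR's actual operator, and the names `combine`, `scan_recurrence`, and `sequential_recurrence` are hypothetical. Each round of the loop could run in parallel over positions, giving about $\log_2 T$ rounds; here the rounds are simply simulated in plain Python.

```python
def combine(left, right):
    """Compose two affine maps h -> a*h + b, applying `left` first."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)


def scan_recurrence(alphas, betas, xs):
    """Inclusive scan over (alpha_t, beta_t * x_t) in about log2(T) rounds."""
    elems = [(a, b * x) for a, b, x in zip(alphas, betas, xs)]
    offset = 1
    while offset < len(elems):
        # Every position in this round depends only on the previous round,
        # so the list comprehension stands in for one parallel step.
        elems = [
            combine(elems[i - offset], elems[i]) if i >= offset else elems[i]
            for i in range(len(elems))
        ]
        offset *= 2
    # With h_0 = 0, the hidden state is the offset term of each prefix map.
    return [b for _, b in elems]


def sequential_recurrence(alphas, betas, xs):
    """Reference implementation: one step per timestep."""
    h, out = 0.0, []
    for a, b, x in zip(alphas, betas, xs):
        h = a * h + b * x
        out.append(h)
    return out
```

Because `combine` is associative, the prefixes can be grouped in any order, which is what allows the recurrence to be computed at logarithmic depth rather than one timestep at a time.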

2.3 Memory-Augmented and Modular Models

  • Dynamic Memory Networks (DMAN): Segments sequences into windows with per-user, dynamically updated external memory blocks distilled via capsule routing, maintaining explicit abstraction of long-term intent compressed into $m \ll T$ slots (Tan et al., 2021).
  • Gated Category-Specific Memory (GatedLongRec): Infers ongoing category-level intent via a gating network and encodes category-specific long-term transitions, conditioning final scoring on a mixture over top-$k$ gated category branches (Cai et al., 2020).
  • Multi-interest Attention with Incremental Updates (LimaRec): Maintains $O(1)$-cost per-update user state via linearized, incremental self-attention and disentangles multiple latent interests for diverse-sequence disambiguation (Wu et al., 2021).
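
To make the constant-cost update concrete, here is a minimal sketch of generic kernelized linear attention kept as a running state. It assumes the common `elu(x) + 1` feature map and a single attention head; it does not reproduce LimaRec's multi-interest formulation, and the class name `IncrementalLinearAttention` is hypothetical.

```python
import math


def feature_map(x):
    """elu(x) + 1, a positive feature map often used in linear attention."""
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]


class IncrementalLinearAttention:
    """Running sums S = sum phi(k_t) v_t^T and z = sum phi(k_t)."""

    def __init__(self, d_k, d_v):
        self.S = [[0.0] * d_v for _ in range(d_k)]
        self.z = [0.0] * d_k

    def update(self, key, value):
        # Cost is O(d_k * d_v) and does not depend on the history length.
        for i, p in enumerate(feature_map(key)):
            self.z[i] += p
            for j, vj in enumerate(value):
                self.S[i][j] += p * vj

    def attend(self, query):
        phi_q = feature_map(query)
        denom = sum(p * zi for p, zi in zip(phi_q, self.z))
        d_v = len(self.S[0])
        if denom == 0.0:
            return [0.0] * d_v  # no events observed yet
        return [
            sum(phi_q[i] * self.S[i][j] for i in range(len(phi_q))) / denom
            for j in range(d_v)
        ]
```

The state has a fixed size, so a new interaction can be folded in without revisiting earlier events. The result equals a normalized weighted average of past values with weights $\phi(q)^\top \phi(k_t)$.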

3. Robustness: Noise Decoupling and Multi-Behavior Handling

  • Efficient Behavior Sequence Miner (EBM): END4Rec replaces $O(L^2)$ attention with FFT-based frequency-domain mining ($O(L \log L)$) and introduces two denoising stages:
    • Hard Noise Eliminator: Token-level masking via Gumbel-softmax masks, removing accidentals or behavior outliers.
    • Soft Noise Filter: Channel-wise frequency-domain filters to isolate stale or decayed interest in dense, mixed-behavior logs (Han et al., 2024).
  • Hypergraph-Based Modeling: MBHT constructs a user-specific hypergraph capturing both semantic and multi-behavior relations, propagating signals across long-range, high-order item co-occurrences (Yang et al., 2022).
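
The frequency-domain route to $O(L \log L)$ sequence mixing can be illustrated with a small sketch: transform the sequence with an FFT, scale each frequency bin, and transform back. This assumes a single channel, a power-of-two length, and fixed filter weights; END4Rec's filters are learned and channel-wise, so this shows only the general mechanism. The function names `fft` and `frequency_filter` are hypothetical.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); length must be a power of two."""
    n = len(x)
    if n <= 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def frequency_filter(signal, weights):
    """Scale each frequency bin by its weight, then return to the time domain."""
    spectrum = fft([complex(s) for s in signal])
    filtered = [w * c for w, c in zip(weights, spectrum)]
    n = len(signal)
    # The inverse transform is unnormalized, so divide by n. Taking the real
    # part assumes the weights are symmetric (w[k] == w[n - k]).
    return [c.real / n for c in fft(filtered, inverse=True)]


# Keep only the zero-frequency bin: every position becomes the sequence mean.
smoothed = frequency_filter([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0],
                            [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```

Setting a weight to zero removes that frequency component from the whole sequence at once, which is the sense in which a frequency-domain filter can suppress a class of signal without pairwise token comparisons.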

4. LLMs and Lifelong Sequence Comprehension

  • Lifelong Sequential Behavior Incomprehension: Pure LLMs struggle when the text prompt context includes long, heterogeneous user histories, even when sequence length is far below their context limit (Lin et al., 2023, Shan et al., 23 Jan 2025).
  • Semantic User Behavior Retrieval (SUBR): ReLLa and ReLLaX address this by replacing the chronological history with the $K$ most semantically relevant items (as measured via LLM-encoded item vectors and cosine similarity), sharply reducing prompt heterogeneity and improving the LLM's extraction of preference signals (Lin et al., 2023, Shan et al., 23 Jan 2025).
  • Full-Stack Optimization: ReLLaX combines SUBR at the data level, soft prompt augmentation (SPA) at the prompt level (injecting collaborative signals as soft tokens), and Component Fully-interactive LoRA (CFLoRA) for parameter adaptation, enabling maximally expressive, per-sample adaptation within the LLM [250
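
The retrieval step described under SUBR can be sketched as a top-$K$ cosine-similarity lookup. The vectors below are hypothetical stand-ins for LLM-encoded item embeddings, and `retrieve_top_k` is an illustrative name; whether the retrieved items are then ordered by similarity or chronologically in the prompt is not specified here, so the sketch simply returns them ranked by similarity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_top_k(history, target_vec, k):
    """Return ids of the k history items most similar to the target item.

    history: list of (item_id, vector) pairs in chronological order.
    """
    ranked = sorted(history, key=lambda item: cosine(item[1], target_vec),
                    reverse=True)
    return [item_id for item_id, _ in ranked[:k]]


# Toy history with 2-d vectors standing in for item embeddings.
history = [
    ("a", [1.0, 0.0]),
    ("b", [0.0, 1.0]),
    ("c", [1.0, 1.0]),
    ("d", [-1.0, 0.0]),
]
selected = retrieve_top_k(history, [1.0, 0.1], k=2)
```

Only the selected items would be written into the prompt, so prompt length depends on $K$ rather than on the full history length.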
