User Drift: Temporal Dynamics in Models
- User drift is the evolving mismatch between historical training data and real-time user behavior, observed in systems ranging from recommender engines to personalized LLMs.
- It manifests as changes in latent preferences, attention shifts, and altered interaction patterns, measured through metrics like AUC, RMSE, and cosine similarity.
- Mitigation strategies include adversarial feature selection, dynamic user modeling, and tailored interventions to realign systems with evolving user profiles.
User drift denotes a family of temporally evolving user-linked changes, but the term is not used uniformly across the literature. In user targeting automation, it is the evolving mismatch between the user population a targeting model was trained on and the user population it is asked to score at inference time (Pan et al., 2020). In matrix-factorization recommenders, it is concept drift in individual user preferences, namely change in a user’s latent tastes over time (Lo et al., 2015). In job recommendation, “Job Preference Drift” is tied to users modifying resumes as they reassess goals or re-target positions (Han et al., 2024). In LLM systems, the term spans evolving and implicit user preferences for decoding-time personalization (Kim et al., 20 Feb 2025), changes in an agent’s behavior caused by user-specific memories, preferences, or personas (Dabas et al., 24 May 2026), and even a gradual shift in a user’s perception, mental model, calibration, confidence, and downstream decision-making caused by evolving model behavior over the course of a conversation (Karny et al., 14 May 2026). Taken together, these usages suggest a common concern with temporal instability in the relation between users, models, and decisions, but they locate the instability at different levels: data distributions, latent preferences, structured actions, reasoning trajectories, or user-side calibration.
1. Terminological range and core distinctions
The literature uses “user drift” in several technically distinct senses rather than as a single standardized construct.
| Research setting | Use of the term | Paper |
|---|---|---|
| User targeting automation | Evolving mismatch between training and inference user populations | (Pan et al., 2020) |
| Matrix-factorization recommendation | Concept drift in individual user latent vectors | (Lo et al., 2015) |
| Job recommendation | Resume-update-linked job preference drift | (Han et al., 2024) |
| Personalized decoding | Evolving and implicit user-specific preferences | (Kim et al., 20 Feb 2025) |
| LLM agents with memory | Behavior changes caused by user-specific memories; specific failure is memory-induced tool-drift | (Dabas et al., 24 May 2026) |
| Multi-turn human–LLM interaction | Shift in user calibration, confidence, and mental model induced by evolving chatbot behavior | (Karny et al., 14 May 2026) |
| Reddit attention dynamics | “Interest drift” within topic and “interest shift” across topics | (Valensise et al., 2019) |
Two distinctions recur. First, several papers separate user drift from broader distributional drift. The MaLTA work treats user drift as a manifestation of concept drift, decomposed into covariate shift, label shift, conditional shift, and temporal patterns such as sudden, gradual, or recurring drift (Pan et al., 2020). By contrast, TMF, BISTRO, and the Reddit study focus on temporal change in users’ own preferences or attention rather than on a train–test mismatch (Lo et al., 2015; Han et al., 2024; Valensise et al., 2019). Second, newer LLM papers separate system-side drift from user-side drift. “Memory-Induced Tool-Drift in LLM Agents” isolates structured parameter deviation caused by irrelevant personal memories (Dabas et al., 24 May 2026), whereas “Multi-Turn Neural Transparency” defines user drift as a change in the user’s own perception and calibration induced by behavioral drift in the model (Karny et al., 14 May 2026).
A further distinction appears in the mechanism-oriented framework for long-term human–LLM interaction. That work centers alignment drift, defined as a gradual process in which system outputs become less constrained by the user’s current message and more shaped by prior interaction history, while still appearing helpful, coherent, and responsive (Yao, 15 May 2026). The same source treats user-side changes in goals, preferences, or tolerance as analytically distinct from that system-side process. This suggests that, in current usage, “user drift” can refer either to change in the user or to user-conditioned change in the model, depending on the field.
2. Population-level drift and predictive systems
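In large-scale predictive systems, user drift is often operationalized as pre-inference distributional mismatch. Uber’s MaLTA paper formalizes the core setup by training an adversarial classifier that distinguishes old data from new data. Writing $y = 0$ for old samples and $y = 1$ for new samples, the classifier produces a score $s(x) = P(y = 1 \mid x)$ and is fit with the binary cross-entropy objective
$$\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\Bigl[y_i \log s(x_i) + (1 - y_i)\log\bigl(1 - s(x_i)\bigr)\Bigr].$$
An adversarial AUC near $0.5$ indicates similar feature distributions, whereas an AUC close to $1$ indicates severe covariate drift (Pan et al., 2020).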
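The same work uses the adversarial score as a density-ratio proxy. Under covariate shift, with $s(x)$ denoting the adversarial classifier’s estimated probability that $x$ comes from the new data, the importance weight is written as
$$w(x) = \frac{p_{\text{new}}(x)}{p_{\text{old}}(x)} \propto \frac{s(x)}{1 - s(x)},$$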
with trimming near to avoid extreme weights (Pan et al., 2020). It then proposes three adaptation strategies: automated feature selection, validation data selection via propensity score matching, and inverse propensity weighting. The strongest empirical result is for adversarial feature selection. On AutoML3, ADA had adversarial AUC and showed no effect, while RL and AA/B/C/D/E had adversarial AUC and benefited from feature selection: RL improved from 0 to 1, AA from 2 to 3, B from 4 to 5, C from 6 to 7, D from 8 to 9, and E from 0 to 1 (Pan et al., 2020). On MaLTA, Training vs Test1 had adversarial AUC 2, and GBDT feature selection reduced the feature set from 3 to 4 while increasing average test AUC by 5 (Pan et al., 2020).
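The adversarial check and the odds-form weights described above can be sketched in a few lines. This is a minimal illustration, not the MaLTA implementation: a one-feature logistic regression stands in for the adversarial classifier, AUC is computed in-sample for brevity, and the function names, the synthetic Gaussian data, and the `clip` parameter are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_adversary(xs, ys, lr=0.1, epochs=300):
    """Fit s(x) = sigmoid(w*x + b) by batch gradient descent on cross-entropy.

    ys is 0 for old samples and 1 for new samples (an assumed convention).
    """
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # cross-entropy gradient w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def auc(scores, labels):
    """Probability that a random new sample outscores a random old one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def importance_weights(scores, clip=0.95):
    """Odds-form weights s/(1-s), with scores clipped to [1-clip, clip] (hypothetical trimming rule)."""
    clipped = [min(max(s, 1.0 - clip), clip) for s in scores]
    return [s / (1.0 - s) for s in clipped]


def adversarial_auc(old, new):
    """Train the adversary on old-vs-new labels and return its in-sample AUC."""
    xs = old + new
    ys = [0] * len(old) + [1] * len(new)
    w, b = train_adversary(xs, ys)
    return auc([sigmoid(w * x + b) for x in xs], ys)


rng = random.Random(0)
old = [rng.gauss(0.0, 1.0) for _ in range(200)]
new_same = [rng.gauss(0.0, 1.0) for _ in range(200)]      # no drift
new_shifted = [rng.gauss(3.0, 1.0) for _ in range(200)]   # strong covariate shift

auc_same = adversarial_auc(old, new_same)        # expected to sit near 0.5
auc_shifted = adversarial_auc(old, new_shifted)  # expected to approach 1
```

In practice the AUC would be estimated on held-out folds, since an in-sample estimate from a flexible classifier can overstate drift.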
This population-level view differs from preference-tracking formulations, but it shares the same temporal structure: models are trained on stale user evidence and deployed on changed user distributions. A plausible implication is that “user drift” in predictive systems is often less about explicit preference change than about the failure of historical feature representations to remain transportable across time.
3. Individual preference drift, migration, and collaborative evolution
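At the individual level, recommender-system work models user drift as change in latent preference state, observed behavior, or interaction pathways. TMF treats each user latent vector as time-varying and fits a user-specific linear transition
$$u_i^{(t+1)} \approx B_i\, u_i^{(t)},$$
where $u_i^{(t)}$ denotes user $i$’s latent vector at time step $t$ and $B_i$ that user’s transition matrix,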
after learning per-time-step user vectors with modified SGD while holding the item matrix 7 fixed (Lo et al., 2015). Ratings at prediction time are then computed by 8 (Lo et al., 2015). On a synthetic dataset deliberately constructed with drift, TMF reported roughly 9–0 RMSE reduction; on real datasets, RMSE improved from 1 to 2 on Ciao, from 3 to 4 on Epinions, from 5 to 6 on Flixster, and from 7 to 8 on MovieLens (Lo et al., 2015). The reported gain was concentrated in users whose latent vectors actually drifted at prediction time (Lo et al., 2015).
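To make the idea of a per-user linear transition concrete, the sketch below fits a transition matrix to a short history of two-dimensional user vectors and uses it to predict a rating one step ahead. It is an illustration under stated assumptions: the closed-form least-squares fit is swapped in for whatever fitting procedure the paper uses, the latent dimension is fixed at two so the matrix inverse can be written out by hand, and every name (`fit_transition`, `predict_rating`, `true_b`) is hypothetical.

```python
def outer_sum(pairs):
    """Sum of outer products a b^T over (a, b) pairs of 2-D vectors."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    for a, b in pairs:
        for i in range(2):
            for j in range(2):
                m[i][j] += a[i] * b[j]
    return m


def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if abs(det) < 1e-12:
        raise ValueError("user history does not span both latent dimensions")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(matrix, vec):
    return [sum(matrix[i][k] * vec[k] for k in range(2)) for i in range(2)]


def fit_transition(history):
    """Least-squares B minimizing sum_t ||u_{t+1} - B u_t||^2 via the normal equations."""
    steps = list(zip(history[:-1], history[1:]))           # (u_t, u_{t+1}) pairs
    cross = outer_sum([(nxt, cur) for cur, nxt in steps])  # sum of u_{t+1} u_t^T
    gram = outer_sum([(cur, cur) for cur, _ in steps])     # sum of u_t u_t^T
    return matmul(cross, inverse_2x2(gram))


def predict_rating(history, item_vec):
    """Advance the last user vector one step, then dot it with a fixed item vector."""
    next_user = apply(fit_transition(history), history[-1])
    return sum(u * v for u, v in zip(next_user, item_vec))


# Hypothetical user whose tastes drift slowly between two latent factors.
true_b = [[0.9, 0.2], [-0.1, 1.0]]
history = [[1.0, 0.5]]
for _ in range(3):
    history.append(apply(true_b, history[-1]))

fitted_b = fit_transition(history)
rating = predict_rating(history, [1.0, 0.0])
```

Because the toy history is generated without noise, the fitted matrix recovers `true_b` up to floating-point error; with real per-step vectors the fit would only approximate the user’s trajectory.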
BISTRO operationalizes job preference drift through resume updates. The platform statistics in that work report that users update resumes approximately every 9 days when they remain unemployed, that frequent updaters are 0 more likely to receive offers, and that over three-quarters change job-seeking objectives during resume refinements (Han et al., 2024). Its three-stage pipeline consists of coarse-grained semantic clustering, fine-grained job preference extraction by hypergraph wavelet learning, and personalized top-1 job recommendation via an RNN (Han et al., 2024). Sessions are formed by segmenting interaction sequences at resume modification times, and the framework assumes preferences are relatively stable within each session (Han et al., 2024). On three real-world datasets from Shenzhen, Shanghai, and Beijing, BISTRO outperformed conventional CF, GNN-based, and sequential baselines offline, and in a half-week online deployment to 2 of active users it achieved higher chat rate and onboarding rate than baselines (Han et al., 2024).
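The session-forming step, in which interaction sequences are cut at resume modification times, is simple enough to sketch directly. The function name, the integer timestamps, and the convention that an interaction at exactly the update time belongs to the new session are assumptions for illustration, not details taken from BISTRO.

```python
from bisect import bisect_right


def segment_sessions(interaction_times, resume_update_times):
    """Group interaction timestamps into sessions split at resume updates.

    An interaction at or after an update falls into the session that update opens.
    """
    updates = sorted(resume_update_times)
    sessions = [[] for _ in range(len(updates) + 1)]
    for t in sorted(interaction_times):
        # bisect_right counts how many updates happened at or before time t.
        sessions[bisect_right(updates, t)].append(t)
    return [s for s in sessions if s]  # drop sessions with no interactions


sessions = segment_sessions([1, 2, 5, 7, 9], [4, 8])
# sessions == [[1, 2], [5, 7], [9]]
```

Each returned session can then be treated as a window in which preferences are assumed to be relatively stable, matching the framework’s stated assumption.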
A more behavioral formulation appears in the Reddit study. There, “a drift is a sudden change of subreddit within the same topic,” whereas “a shift is instead a sudden change from one topic to another” (Valensise et al., 2019). The geometric detector uses binned activity vectors, written here as $\mathbf{a}$ and $\mathbf{b}$, their cosine similarity
$$\cos\theta = \frac{\mathbf{a}\cdot\mathbf{b}}{\lVert\mathbf{a}\rVert\,\lVert\mathbf{b}\rVert},$$
and angle 5, with a sudden variation declared when 6 (Valensise et al., 2019). On a corpus of 7 subreddits, 8M+ posts, 9M+ comments, and 0M+ users, user lifetimes on subreddits were short relative to the 1-month observation window, peaking around 2 days with skewness 3, and at least 4 of users displayed at least one shift (Valensise et al., 2019). The paper concluded that trajectories were bursty and migratory rather than smooth, with frequent transitions between recreational subreddits and those more related to news and politics (Valensise et al., 2019).
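A minimal sketch of an angle-based detector of this kind follows. It assumes that consecutive time bins are compared, that each activity vector counts a user’s posts per subreddit, and that the threshold is supplied by the caller; the value `45.0` used below is purely hypothetical and is not the threshold from the paper. The `label_change` helper encodes the quoted drift/shift definitions using an assumed subreddit-to-topic mapping.

```python
import math


def angle_degrees(a, b):
    """Angle between two activity vectors, from their cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("activity vectors must be non-zero")
    cos = max(-1.0, min(1.0, dot / norm))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos))


def sudden_variations(activity, threshold_deg):
    """Indices t where the angle between bins t-1 and t exceeds the threshold."""
    return [t for t in range(1, len(activity))
            if angle_degrees(activity[t - 1], activity[t]) > threshold_deg]


def label_change(prev_subreddit, next_subreddit, topic_of):
    """'drift' for a subreddit change within one topic, 'shift' across topics."""
    if prev_subreddit == next_subreddit:
        return "none"
    same_topic = topic_of[prev_subreddit] == topic_of[next_subreddit]
    return "drift" if same_topic else "shift"


# Columns: posts in r/soccer, r/nba, r/politics (hypothetical user, four time bins).
activity = [[5, 0, 0], [4, 1, 0], [0, 6, 0], [0, 0, 3]]
topics = {"soccer": "sports", "nba": "sports", "politics": "news"}

flagged = sudden_variations(activity, threshold_deg=45.0)
first_change = label_change("soccer", "nba", topics)
second_change = label_change("nba", "politics", topics)
```

Here the move from soccer to basketball is flagged as a drift and the move to politics as a shift, while the small rebalancing between the first two bins stays below the threshold.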
The simulation framework for algorithmic drift adds an explicitly counterfactual perspective. It defines user behavior through resistance 5, inertia 6, and randomness 7, and quantifies recommender-induced change with Algorithmic Drift Score and Delta Target Consumption (Coppolillo et al., 2024). In the two-category case,
8
while
9
With RecVAE, 0 rounds, and 1 steps, the framework showed that semi-radicalized “bridge” users amplified both ADS and DTC, that drift increased as users relied more on recommendations and less on their own preferences, and that increasing 2 in 3 slightly changed DTC but left ADS distributions essentially unaffected (Coppolillo et al., 2024).
Generative recommendation introduces a related but infrastructure-level notion. DACT treats evolving user behavior as collaborative drift: new interactions change item co-occurrence patterns and popularity, making collaboration-aware item tokenizers stale (Feng et al., 31 Mar 2026). Its two stages are tokenizer fine-tuning with a Collaborative Drift Identification Module and hierarchical code reassignment via a relaxed-to-strict strategy (Feng et al., 31 Mar 2026). On Tools, naive fine-tuning changed codes almost completely—Layer1 4, Layer2 5, Layer3 6, Overall 7—whereas DACT with 8 and 9 reduced change rates to Layer1 0, Layer2 1, Layer3 2, Overall 3 (Feng et al., 31 Mar 2026). This literature frames user drift not as a property of users alone but as a driver of continual instability in the representational interface between items and generative models.
4. Personalized LLMs, memory, and drifted reasoning or action
In personalized LLMs, user drift is often tied to the fact that preferences are heterogeneous, partly implicit, and can evolve over time. The decoding-time framework “Drift” uses this evolving and implicit nature of preferences to motivate a training-free personalization method in which user reward is a composition of interpretable attributes,
4
and generation is steered by
5
With only 6–7 examples, Drift reached approximately 8 test accuracy by 9 examples, and with 0 training examples achieved win rates of 1 against pure LLM outputs for Llama-8B and 2 for Gemma-9B under Gold RM and GPT-judge respectively (Kim et al., 20 Feb 2025). Here, user drift is not a failure mode but the target of rapid personalization.
The tool-using agent setting turns the same personalization machinery into a vulnerability. “Memory-Induced Tool-Drift in LLM Agents” defines user drift broadly as changes in an agent’s behavior caused by user-specific memories, preferences, or personas, and isolates memory-induced tool-drift as the case where personality-driven biases stored in long-term memory silently change tool-call parameter choices in professional contexts where personalization is not appropriate (Dabas et al., 24 May 2026). The formal setup is
3
with comparison across no memory 4, neutral memories 5, and biased memories 6 (Dabas et al., 24 May 2026). The MEMDRIFT benchmark contains 7 scenarios spanning 8 bias dimensions and 9 professional domains, with 0 tool calls per memory condition and a judge-produced Likert deflection score on a 1–2 scale (Dabas et al., 24 May 2026). Across seven frontier models, biased memories raised deflection scores by up to 3 points; Claude Sonnet 4.5 had overall 4, Gemini 3.1 Pro 5, Gemini 2.5 Pro 6, and GPT-5.4 7 under direct memory injection (Dabas et al., 24 May 2026). Drift persisted under three production memory architectures—Mem0, MemPalace, and SimpleMem—and in the strongest reported memory-framework configuration, SimpleMem + Gemini 3.1 Pro reached overall 8, near the paper’s maximum reported 9 (Dabas et al., 24 May 2026). A scan of 00 tools across 01 verified MCP servers flagged 02 as highly susceptible, with validated examples flipping parameters such as visibility, safesearch, and priority (Dabas et al., 24 May 2026).
Mechanistically, that paper treats memories as implicit steering vectors and attention hijackers. For each bias dimension and layer it constructs a steering direction
03
and measures the projection induced by memory relative to the no-memory baseline (Dabas et al., 24 May 2026). Across all five dimensions, biased memories yielded larger positive projections than neutral memories, especially in middle-to-late layers, and attention shifted toward memory and away from tool schema, user query, and partial tool call (Dabas et al., 24 May 2026).
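The projection measurement can be illustrated with toy activations. The sketch below is not the paper’s construction: it assumes a difference-of-means steering direction (a common choice, swapped in here because the source formula is not reproduced above), uses two-dimensional lists in place of real hidden states, and all function names are hypothetical.

```python
import math


def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def steering_direction(biased_acts, neutral_acts):
    """Unit vector along mean(biased) - mean(neutral); an assumed construction."""
    diff = [b - n for b, n in zip(mean_vector(biased_acts), mean_vector(neutral_acts))]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("biased and neutral activations have identical means")
    return [d / norm for d in diff]


def induced_projection(with_memory, no_memory, direction):
    """Projection of the memory-induced activation change onto the direction."""
    return sum((m - b) * d for m, b, d in zip(with_memory, no_memory, direction))


# Toy activations for one layer and one bias dimension.
direction = steering_direction(
    biased_acts=[[2.0, 0.0], [4.0, 0.0]],
    neutral_acts=[[0.0, 0.0], [2.0, 0.0]],
)
baseline = [1.0, 1.0]  # activation with no memory in context
proj_biased = induced_projection([3.0, 1.0], baseline, direction)
proj_neutral = induced_projection([1.2, 1.0], baseline, direction)
```

In this toy case the biased memory yields the larger positive projection, which is the qualitative pattern the paper reports for middle-to-late layers.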
DRIFTLENS extends the same concern from tool parameters to reasoning trajectories. It defines memory-induced reasoning drift as the change in the symbolic reasoning trajectory for a question when irrelevant user-attribute memory is injected, even when the final answer remains fluent, on-topic, and plausible (Fang et al., 2 Jul 2026). The framework maps each reasoning step into a discrete value symbol and compares baseline and intervened trajectories with DTW and OSRI (Fang et al., 2 Jul 2026). The main benchmark has 04 persona-agnostic, unverifiable, reasoning-invoking questions, ten user-attribute categories, and controls for pragmatic noise and major life events (Fang et al., 2 Jul 2026). Pragmatic noise did not significantly elevate drift above the noise floor; on Claude Sonnet 4.6 it changed DTW by 05 and SRI by 06, and on Qwen3-4B by 07 and 08, all with 09 (Fang et al., 2 Jul 2026). By contrast, life events caused large increases—10 DTW and 11 SRI on Claude Sonnet 4.6, and 12 DTW and 13 SRI on Qwen3-4B, all with 14 (Fang et al., 2 Jul 2026). Across four models and ten persona categories, all categories were significantly above the noise floor, with SRI standardized effect sizes of approximately 15–16 on Qwen3-4B, 17–18 on Claude Sonnet 4.6, 19–20 on GPT-OSS-120B, and 21–22 on DeepSeek-R1 (Fang et al., 2 Jul 2026). DPO- and GRPO-based post-training reduced drift, but neither uniformly dominated across Qwen3-4B, Phi-4-mini-instruct, and Gemma2-2B (Fang et al., 2 Jul 2026).
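Comparing a baseline and an intervened symbolic trajectory with DTW can be sketched as follows. This is a generic dynamic time warping routine with a 0/1 mismatch cost between symbols; the cost function, the example symbol names, and the absence of any normalization are assumptions, and OSRI is not implemented here.

```python
def dtw_distance(baseline, intervened):
    """Dynamic time warping distance between two symbol sequences.

    Each aligned pair costs 0 if the symbols match and 1 otherwise.
    """
    n, m = len(baseline), len(intervened)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = 0.0 if baseline[i - 1] == intervened[j - 1] else 1.0
            # Extend the cheapest of: advance baseline, advance intervened, advance both.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical value symbols assigned to each reasoning step.
baseline_steps = ["evidence", "tradeoff", "conclusion"]
drifted_steps = ["evidence", "family", "family", "conclusion"]

same = dtw_distance(baseline_steps, baseline_steps)
stretched = dtw_distance(["a", "b", "c"], ["a", "a", "b", "c"])
drifted = dtw_distance(baseline_steps, drifted_steps)
```

A repeated step costs nothing under warping, so `stretched` is zero, whereas substituting different value symbols raises the distance even when the first and last steps are unchanged.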
5. User-side drift in long-term human–LLM interaction
A different strand of work uses “user drift” to describe change in the user rather than in the model’s outputs alone. “Multi-Turn Neural Transparency” defines user drift in multi-turn human–AI interaction as the gradual shift in a user’s perception, mental model, calibration, confidence, and downstream decision-making caused by evolving model behavior over the course of a conversation (Karny et al., 14 May 2026). The paper distinguishes this from model behavioral drift, such as becoming more sycophantic, more toxic, or more robotic/human-like (Karny et al., 14 May 2026). Its intervention is a multi-turn neural transparency interface built on six bipolar trait vectors—empathy, toxicity, romanticness, sycophancy, sophistication, and roboticness—derived from contrastive system prompts and activation-space directions with reported fits of 23 at layer 24 (Karny et al., 14 May 2026).
Trait scores are computed by projection,
25
and visualized through a sunburst showing current behavioral state and a drift panel showing per-turn trajectory (Karny et al., 14 May 2026). In a randomized controlled study with 26, participants without visualization had RMSE approximately 27–28 and sign accuracy approximately 29 at baseline (Karny et al., 14 May 2026). Any visualization versus control significantly improved calibration across all four RMSE paradigms: Anticipation vs Initial 30, 31, 32; Evaluation vs Initial 33, 34, 35; Evaluation vs Final 36, 37, 38; Evaluation vs Average 39, 40, 41 (Karny et al., 14 May 2026). The multi-turn interface further outperformed the static single-turn visualization on Evaluation vs Average, with 42, 43, 44 (Karny et al., 14 May 2026). Control participants increased self-rated predictive ability by 45 and trust by 46 despite no corresponding gain in accuracy, whereas visualization groups did not show that increase (Karny et al., 14 May 2026).
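The two calibration measures mentioned above, RMSE and sign accuracy, can be sketched as follows. The exact definitions used in the study are not given here, so this example assumes that each participant rates every trait on a signed scale, that RMSE is taken between those ratings and the activation-derived trait scores, and that sign accuracy is the share of traits where both fall on the same side of zero. All names and numbers are hypothetical.

```python
import math


def rmse(human_ratings, trait_scores):
    """Root-mean-square error between human ratings and model-derived trait scores."""
    if len(human_ratings) != len(trait_scores) or not human_ratings:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((h - s) ** 2 for h, s in zip(human_ratings, trait_scores))
    return math.sqrt(squared / len(human_ratings))


def sign_accuracy(human_ratings, trait_scores):
    """Share of traits where rating and score are both positive or both non-positive."""
    if len(human_ratings) != len(trait_scores) or not human_ratings:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for h, s in zip(human_ratings, trait_scores) if (h > 0) == (s > 0))
    return matches / len(human_ratings)


# Hypothetical ratings for four bipolar traits on a signed scale.
human = [0.5, -0.2, 0.1, -0.4]
model = [0.3, 0.4, 0.2, -0.1]

error = rmse(human, model)
agreement = sign_accuracy(human, model)  # three of four signs agree
```

Treating zero as non-positive is a simplification; a study with a true neutral midpoint might exclude such ratings instead.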
The mechanism-oriented framework on alignment drift places such user-side effects inside a recursive interactional process. It distinguishes Signal A, the meaning directly readable from the message itself, from Signal B, which is derived through inference from contextual premises including needs, emotional state, cognitive preferences, current situation, cultural background, and interaction history (Yao, 15 May 2026). Drift develops because inferential products based on Signal B remain in context and become premises for later inferences, while user feedback-type messages select and reinforce sub-patterns that appear “most suitable” for keeping the user engaged (Yao, 15 May 2026). The framework divides the process into low-alignment, high-alignment, and critical regimes, with “failure of correction” and “intention override” characterizing the critical regime (Yao, 15 May 2026). It also states that, as long as the context is not reset or cleared, and as long as the interaction continues, drift can slow down in the short term, but it cannot move backward (Yao, 15 May 2026). In this literature, user drift and alignment drift are analytically separable but operationally entangled: changed model behavior can induce changed user reliance, while changed user behavior supplies the feedback that selects and stabilizes sub-patterns.
6. Measurement, mitigation, and unresolved problems
The measurement of user drift varies sharply by domain. Population-level predictive systems rely on adversarial AUC, feature importances, and matching balance criteria such as 47 (Pan et al., 2020). Preference-tracking recommenders use RMSE on temporal holdout (Lo et al., 2015), HR@k and MRR@k in offline and online settings (Han et al., 2024), angle thresholds above 48 for bursty attention reconfiguration (Valensise et al., 2019), and graph-based pathway metrics such as ADS and DTC (Coppolillo et al., 2024). Personalized LLM work has introduced judge-scored deflection on a 49–50 Likert scale for tool calls (Dabas et al., 24 May 2026), DTW and OSRI for reasoning trajectories (Fang et al., 2 Jul 2026), and RMSE between human ratings and activation-derived trait scores for user calibration (Karny et al., 14 May 2026). This suggests that the field does not yet possess a single canonical metric for user drift; instead, each formulation measures a different failure surface.
Mitigation is similarly heterogeneous. In MaLTA, adversarial feature selection was more robust than propensity-based weighting, which consistently underperformed baseline on heavy-drift datasets (Pan et al., 2020). In MEMDRIFT, prompt-based relevance instructions reduced 51 by 52 overall on GPT-5.4 but left substantial residual drift, and Self-ReCheck removed biased memories perfectly on MemDrift because of strict personal–professional separation by construction, yet on a multi-hop realistic stress test it had recall 53 and false positive rate 54 (Dabas et al., 24 May 2026). In DRIFTLENS, both DPO and GRPO reduced reasoning drift, but their side effects depended on backbone and reward design; for example, format-augmented GRPO often helped instruction following, while DPO improved non-distraction accuracy on all tested backbones (Fang et al., 2 Jul 2026). In user-calibration work, multi-turn transparency improved anticipation and evaluation and reduced overconfidence without altering the underlying model (Karny et al., 14 May 2026). In the alignment-drift framework, the primary boundary conditions are explicit context reset, stopping the interaction, reducing single-system reliance, and system refusal when appropriate (Yao, 15 May 2026).
Several unresolved problems recur. MEMDRIFT studies single-tool calls rather than tool chains and fixes tool choice rather than allowing biased memories to skew tool selection itself (Dabas et al., 24 May 2026). DRIFTLENS measures externalized reasoning rather than latent cognition and is ontology-dependent, even though cross-model agreement on the refined ontology exceeded 55 and a human spot check agreed with 56 labels (Fang et al., 2 Jul 2026). The neural transparency study lasted 57 minutes per conversation, whereas the largest safety risks may emerge over weeks or months (Karny et al., 14 May 2026). TMF assumes stationary item factors and linear first-order user transitions (Lo et al., 2015). BISTRO assumes relative stationarity within sessions segmented by resume updates (Han et al., 2024). DACT assumes drift is moderate and that CF embeddings reliably reflect current user behavior (Feng et al., 31 Mar 2026).
A final misconception addressed across these papers is that drift is necessarily visible at the surface level. The LLM-agent and DRIFTLENS results show that final answers can remain fluent, on-topic, and plausible while tool parameters or reasoning trajectories drift materially (Dabas et al., 24 May 2026; Fang et al., 2 Jul 2026). The neural transparency study shows that users can become more confident without becoming more accurate (Karny et al., 14 May 2026). The alignment-drift framework argues that subjective experience may improve as the system becomes more familiar, useful, and attuned even while outputs become less constrained by the current message (Yao, 15 May 2026). The broader implication is that user drift, across its many meanings, is often a latent temporal phenomenon that becomes consequential precisely because it is not easily diagnosed from single outputs or short evaluation windows.