Agent4Rec: LLM-Powered User Simulation for RS

Updated 21 January 2026

Agent4Rec is a framework of LLM-enabled user simulators that emulate real user behaviors with dynamic profiles, memory, and emotion-driven actions.
The architecture integrates profile, memory, and action modules with precise mathematical formulations to mimic taste- and emotion-based decision making.
Agent4Rec enables rigorous evaluation of recommender systems by assessing alignment, filter bubbles, and causal relationships in controlled simulation environments.

Agent4Rec designates a class of LLM-empowered generative user simulators designed for rigorous, realistic, and multi-faceted evaluation of recommender systems (RS). Unlike conventional reward-driven agentic recommenders, Agent4Rec is instantiated as a simulated population of user agents, each equipped with distinct profiles, dynamic memory, and a repertoire of taste-driven and emotion-driven actions. These agents interact page-by-page with personalized recommendation environments—typically under collaborative filtering algorithms—yielding both quantitative and qualitative feedback. Agent4Rec enables principled study of alignment between simulated and real user preferences, investigation of phenomena such as filter bubbles and causal relationships, and systematic evaluation of RS in controlled, reproducible settings (Zhang et al., 2023, Peng et al., 14 Feb 2025).

1. System Architecture: Agent Modules and Workflow

Agent4Rec comprises two major subsystems: a fleet of generative user agents and a recommendation environment with plug-and-play RS algorithms. Each user agent incorporates three modules:

Profile Module: Aggregates social traits and preference summaries using real-world data (e.g., MovieLens, Steam, Amazon-Book). Social metrics include activity, conformity, and diversity, segmented into quantiles (low/medium/high) and supplemented by a taste summary generated via LLM prompting over historical data.
Memory Module: Logs both factual (item interactions, ratings) and emotional (sentiment and satisfaction embeddings) memories, updated per interaction and via periodic emotion-driven reflection notes every $K$ steps.
Action Module: Selects actions per item slate using a softmax policy blending taste-matching and emotion bias, plus session-level actions (exit/continue/interview) modulated by satisfaction, fatigue, and social traits.

The agent-environment interaction proceeds as follows: initialization of user profile from dataset, sequential page presentations with RS-generated item slates, action selection (view, rate, skip, comment), memory update, emotion-driven reflection, and loop termination based on user fatigue or satisfaction. Each agent's observations and actions can be used to retrain or enhance the RS (Zhang et al., 2023).
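The loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the framework's implementation: the `Agent` class, the `recommend` stand-in, the rule that replaces the LLM action module, and the `patience` and `reflect_every` parameters are all assumptions introduced for the example, and the action set is reduced to view/skip.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Hypothetical agent; liked_genres stands in for the LLM taste summary."""
    liked_genres: set
    patience: int = 2         # consecutive pages without a liked item before exit
    reflect_every: int = 2    # K: pages between reflection notes
    factual: list = field(default_factory=list)    # (page, item, action) records
    emotional: list = field(default_factory=list)  # reflection notes
    satisfaction: float = 0.0

def recommend(catalog, page, page_size):
    """Stand-in recommender: serves a fixed ranking one page at a time."""
    return catalog[page * page_size:(page + 1) * page_size]

def simulate(agent, catalog, page_size=2, max_pages=5):
    """One session: slate -> actions -> memory update -> reflection -> exit check."""
    bad_pages = 0
    for page in range(max_pages):
        slate = recommend(catalog, page, page_size)
        if not slate:
            break  # nothing left to recommend
        liked = 0
        for item, genre in slate:
            # Rule-based stand-in for the LLM-driven action module.
            action = "view" if genre in agent.liked_genres else "skip"
            agent.factual.append((page, item, action))
            liked += 1 if action == "view" else 0
        # Satisfaction rises when more than half of the slate is liked.
        agent.satisfaction += liked / len(slate) - 0.5
        bad_pages = 0 if liked else bad_pages + 1
        if (page + 1) % agent.reflect_every == 0:
            agent.emotional.append(
                f"after page {page}: satisfaction {agent.satisfaction:+.2f}")
        if bad_pages >= agent.patience:
            break  # fatigue-driven exit
    return agent

catalog = [("A", "scifi"), ("B", "drama"), ("C", "scifi"),
           ("D", "scifi"), ("E", "drama"), ("F", "drama")]
agent = simulate(Agent(liked_genres={"scifi"}, patience=1), catalog)
print(len(agent.factual), agent.emotional)
```

In a fuller simulator the factual log would also carry ratings and comments, and the logged interactions could be fed back to retrain the recommender between rounds.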

2. Mathematical Formulation and Simulation Protocol

Agent4Rec formalizes user simulation along the following lines:

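Profile Construction: Social metrics are computed from interaction data, e.g., activity $T_{\mathrm{act}}^u = \sum_{i} y_{ui}$, conformity $T_{\mathrm{conf}}^u = \frac{1}{\sum_{i} y_{ui}} \sum_{i} y_{ui}\,|r_{ui}-R_i|^2$, and diversity $T_{\mathrm{div}}^u = \left|\bigcup_{i : y_{ui}=1} G_i\right|$, where $y_{ui} \in \{0,1\}$ indicates whether user $u$ interacted with item $i$, $r_{ui}$ is the user's rating, $R_i$ is the average rating of item $i$, and $G_i$ is the set of genres of item $i$.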
Memory Management: Factual $M^{(t)}_{\mathrm{fact}}$ and emotional $M^{(t)}_{\mathrm{emo}}$ memories, updated per episode and via LLM-driven chain-of-thought and reflection summaries.
Action Selection: Taste-driven actions per item $i$ with score $s_{u,i} = \cos(\phi_u, \psi_i) + \lambda \langle u_{\mathrm{bias}}, e_t \rangle$, where $\phi_u$ and $\psi_i$ are agent/item embeddings, and $e_t$ encodes recent affective state.
Session Termination: An exit probability, modulated by satisfaction, fatigue, and social traits, determines after each page whether the agent leaves or continues.
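As a concrete illustration of the profile-construction step, the sketch below computes the three social metrics for a single user and assigns a low/medium/high bucket. It assumes that conformity is read as the mean squared deviation of the user's ratings from each item's average rating; the function names, the dictionary-based inputs, and the simple tercile rule are assumptions made for the example.

```python
def social_traits(ratings, item_mean, item_genres):
    """Compute (activity, conformity, diversity) for one user.

    ratings:     {item: rating} for items the user interacted with (y_ui = 1)
    item_mean:   {item: average rating R_i over all users}
    item_genres: {item: set of genres G_i}
    """
    activity = len(ratings)
    if activity == 0:
        return 0, 0.0, 0
    # Mean squared gap between the user's rating and the item's average rating.
    conformity = sum((r - item_mean[i]) ** 2 for i, r in ratings.items()) / activity
    # Number of distinct genres across all interacted items.
    diversity = len(set().union(*(item_genres[i] for i in ratings)))
    return activity, conformity, diversity

def tercile(value, population):
    """Bucket a value as low/medium/high against a non-empty population."""
    ranked = sorted(population)
    low_cut = ranked[len(ranked) // 3]
    high_cut = ranked[(2 * len(ranked)) // 3]
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "medium"
    return "high"

ratings = {"a": 5, "b": 3}
item_mean = {"a": 4.0, "b": 3.0}
item_genres = {"a": {"scifi", "action"}, "b": {"action", "drama"}}
print(social_traits(ratings, item_mean, item_genres))
print(tercile(4, [1, 2, 3, 4, 5, 6]))
```

Under this reading, a smaller conformity value means the user's ratings sit closer to the population averages.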

No backpropagation is performed inside agents; hyperparameters are tuned via grid search for alignment with empirical user statistics. Agents can be run concurrently, each seeded from distinct user data, offering a population-level simulation of recommender interactions (Zhang et al., 2023, Peng et al., 14 Feb 2025).
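The action score from the formulation above can be turned into a softmax policy as in the following sketch. One detail worth noting: the emotion term $\lambda \langle u_{\mathrm{bias}}, e_t \rangle$ does not depend on the item, so it shifts every item score equally and cancels in a softmax taken over items alone. The sketch therefore adds a "skip" alternative with a fixed score, which is an assumption of this example rather than something stated in the source, as are the function names and the default `lam`.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def softmax(scores):
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def action_probabilities(phi_u, item_embs, u_bias, e_t, lam=0.5, skip_score=0.0):
    """Probabilities over [item_0, ..., item_n-1, skip].

    s_{u,i} = cos(phi_u, psi_i) + lam * <u_bias, e_t>; the skip option
    (hypothetical) keeps a fixed score so the emotion term can matter.
    """
    emotion = lam * sum(x * y for x, y in zip(u_bias, e_t))
    scores = [cosine(phi_u, psi) + emotion for psi in item_embs]
    return softmax(scores + [skip_score])

neutral = action_probabilities([1, 0], [[1, 0], [0, 1]], [1.0], [0.0])
upbeat = action_probabilities([1, 0], [[1, 0], [0, 1]], [1.0], [1.0], lam=1.0)
print(neutral, upbeat)
```

With a positive affective state the probability mass moves away from skipping and toward engaging with the slate, while the relative preference between items is still set by the taste term.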

3. Evaluation Metrics and Alignment Studies

Agent4Rec is systematically evaluated on large-scale datasets, e.g., MovieLens-1M, Amazon-Book, and Steam. Key metrics include:

Preference Discrimination: Accuracy, Precision, Recall, F1 for predicting “liked” vs “unseen” items.
Rating Distribution Alignment: KL-divergence of simulated vs real-user rating histograms.
Social Trait Cohorts: Analysis of action selection and rating patterns across population strata (activity, conformity, diversity).
Recommender Strategy Response: Comparison of agent response metrics under different RS algorithms (MF, MultVAE, LightGCN).
Filter Bubble Emulation: Iterative retraining and monitoring of diversity metrics.
Causal Discovery: Causal structure learning with DirectLiNGAM (a linear non-Gaussian structural equation model) over variables such as item quality, popularity, exposure, and agent rating, revealing a causal amplification of popularity bias.
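Two of the metrics listed above are simple enough to sketch directly: the KL-divergence between rating histograms and the precision/recall/F1 of preference discrimination. The sketch assumes a 1-to-5 rating scale, a small additive smoothing constant, and the direction KL(real || simulated); none of these choices is specified in the source, and the function names are hypothetical.

```python
import math
from collections import Counter

def rating_distribution(ratings, levels=(1, 2, 3, 4, 5), eps=1e-9):
    """Smoothed histogram over rating levels (ratings assumed to lie in levels)."""
    counts = Counter(ratings)
    total = len(ratings) + eps * len(levels)
    return [(counts[level] + eps) / total for level in levels]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def precision_recall_f1(predicted, actual):
    """Binary metrics; 1 means the item is (predicted to be) liked."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

real = rating_distribution([5, 4, 4, 3, 5, 2])
simulated = rating_distribution([5, 5, 4, 4, 4, 3])
print(kl_divergence(real, simulated))
print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

A KL value near zero indicates that the simulated rating histogram closely matches the real one; the smoothing constant only prevents division by zero for rating levels that never occur.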

Through these metrics, Agent4Rec agents demonstrate micro-level alignment with individual preferences, macro-level reproduction of rating distributions, and the capacity to simulate emergent phenomena, supporting robust benchmarking and analysis (Zhang et al., 2023).

4. Position Among LLM-Powered Recommender Agents

Agent4Rec fits the "simulation-oriented" paradigm in the taxonomy of LLM-powered agents for RS elaborated in recent surveys (Peng et al., 14 Feb 2025). The taxonomy distinguishes three paradigms:

Recommender-Oriented Agents: Directly optimize top-K recommendation (MACRec, Collab-REC) via distributed LLM agent collaboration (Wang et al., 2024, Banerjee et al., 20 Aug 2025).
Interaction-Oriented Agents: Model multi-turn recommendation as MDP/dialogue (AgentRec, Rec4Agentverse) with dynamic preference adaptation and explainability (Ma et al., 2 Oct 2025, Zhang et al., 2024).
Simulation-Oriented Agents: Instantiate populations of user agents whose behavioral fidelity supports evaluation, ablation, and fairness analysis (Peng et al., 14 Feb 2025, Zhang et al., 2023, Liu et al., 12 Sep 2025).

Agent4Rec specifically advances the state of user simulation by combining structured behavioral modeling, emotion-driven state evolution, chain-of-thought reflection, and integration with page-by-page recommendation environments. It enables experimental protocols, such as A/B testing and causal inference, that were previously inaccessible in offline, metrics-only RS evaluation.

5. Systemic Challenges, Limitations, and Future Outlook

Multiple challenges attend Agent4Rec deployments:

Computational Cost: Per-step LLM invocation is expensive; scaling to industrial-sized datasets requires hybrid architectures and caching strategies.
Explicit Planning: Absence of a formal Planning module may limit simulation of long-horizon user strategies and strategic exploration.
Evaluation Standards: Lack of standardized user-simulator benchmarks; subjective believability remains the main metric for simulation naturalism (Peng et al., 14 Feb 2025).
Robustness and Security: Vulnerability to prompt injections and adversarial behaviors necessitates integration of anomaly detection in the memory/planning loop.
Realism Gap: Ensuring that the simulated agent population accurately reflects diverse real user cohorts, including social-network and exogenous influences, remains an open problem.
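To illustrate the caching idea raised under computational cost, the sketch below memoizes responses by prompt hash so that identical prompts trigger only one model call. It is a generic pattern rather than anything taken from the Agent4Rec codebase: `CachedLLM` and the `llm_fn` callable are hypothetical names, and the approach assumes deterministic decoding, since caching sampled responses would suppress the behavioral variance a simulator may want to keep.

```python
import hashlib

class CachedLLM:
    """Wrap any prompt -> text callable with an in-memory response cache."""

    def __init__(self, llm_fn):
        self.llm_fn = llm_fn   # hypothetical callable that queries the model
        self.cache = {}
        self.calls = 0         # number of real (uncached) invocations

    def __call__(self, prompt):
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.llm_fn(prompt)
        return self.cache[key]

# Stand-in for a real model call, used only to make the sketch runnable.
fake_llm = lambda prompt: prompt.upper()

llm = CachedLLM(fake_llm)
llm("rate item 42")
llm("rate item 42")   # served from the cache
llm("rate item 43")
print(llm.calls)
```

Cache hits are most likely for prompts shared across agents, such as item descriptions, and least likely for prompts that embed an agent's full memory.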

Promising directions include multi-agent co-simulation (simulating user cohorts and network effects) (Liu et al., 12 Sep 2025), adversarial defenses, unifying evaluation suites, hybrid models with RL finetuning, and deep integration with both recommender and agent platforms for continuous system improvement (Zhang et al., 2023, Peng et al., 14 Feb 2025, Zhang et al., 2024).

6. Historical Evolution, Applications, and Impact

Agent4Rec reflects the convergence of LLM-powered simulation, agentic decision-making, and recommender benchmarks, as pioneered by Zhang et al. (2023). It is related to multi-agent collaboration frameworks (MACRec, Collab-REC), agent-dispatcher designs (AgentRec), and agent-item architectures (Rec4Agentverse). Applications range from recommendation algorithm evaluation and filter bubble and causal analysis to fairness and robustness studies. Its open-source implementation (https://github.com/LehengTHU/Agent4Rec) has informed best practices in reproducible RS simulation, population-level diversity modeling, and integration of explainability and emotional intelligence in user-interfacing agents.

Agent4Rec’s modular and extensible schema positions it as a reference simulator for the next generation of agentic recommender systems, supporting principled algorithmic comparison and deep analyses of interactive human–RS dynamics (Peng et al., 14 Feb 2025, Zhang et al., 2023).