User-Centric Agent: Design & Evaluation
- User-centric agents are intelligent systems that prioritize evolving user needs and preferences, ensuring transparency, control, and user-defined satisfaction.
- They employ modular architectures with proactive intent recognition, dynamic context folding, and personalized memory augmentation to support adaptive interactions.
- Evaluation metrics focus on user-specific outcomes, such as success rates and personalization scores, while addressing challenges in scalability, privacy, and governance.
A user-centric agent is an intelligent system whose design, operation, and success criteria are anchored in the evolving needs, preferences, routines, and control of the user. Unlike platform-centric or agent-centric systems, which optimize for provider-driven objectives or the agent’s own behavior, user-centric agents treat the user as both principal and partner: they align their policies to maximize utility as defined by user-specified goals, support proactive and adaptive interaction, and maintain transparency, privacy, and user autonomy as first-class principles (Eskenazi et al., 2019, Zhang et al., 17 Feb 2026). This entry synthesizes architectures, algorithms, empirical evaluation, and open challenges from recent work on user-centric agent design and practice.
1. Conceptual Foundations and Historical Context
The user-centric paradigm explicitly contrasts with agent-centric and platform-centric models. Agent-centric systems have historically focused on metrics like task completion, slot accuracy, or Turing Test performance, judging success by the agent’s internal objectives or its ability to mimic humanness (Eskenazi et al., 2019). Platform-centric services, meanwhile, optimize for provider KPIs such as engagement, retention, and conversion, often diverging from user welfare (Zhang et al., 17 Feb 2026). In user-centric architectures, the agent operates as a partner: the primary metric of success is directly rooted in the user’s satisfaction, task accomplishment, and continued engagement.
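Formally, a user-centric agent can be defined as selecting a policy that maximizes user-defined utility,

$$\pi^{*} = \arg\max_{\pi} \; U_{\text{user}}(\pi \mid c, g, k),$$

where $c$ denotes the user’s context, $g$ her goals, and $k$ her constraints (Zhang et al., 17 Feb 2026).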
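As a rough illustration of this objective, the sketch below picks, from a finite set of candidate actions, the one with the highest user-defined utility among those that satisfy every user constraint. The names `UserState`, `select_action`, and `weighted_utility`, the dictionary fields, and the weighted-sum utility are all assumptions made for the example; the cited work does not prescribe this form.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class UserState:
    """Hypothetical container for the three inputs of the objective."""
    context: Dict[str, float]      # current situation, e.g. remaining budget
    goals: Dict[str, float]        # weights over outcome attributes the user values
    constraints: List[Callable]    # predicates (action, context) -> bool


def select_action(actions: List[dict], utility: Callable, state: UserState) -> Optional[dict]:
    """Return the feasible action with maximal user utility, or None if none is feasible."""
    feasible = [
        a for a in actions
        if all(ok(a, state.context) for ok in state.constraints)
    ]
    if not feasible:
        return None  # nothing respects the user's constraints: do not act
    return max(feasible, key=lambda a: utility(a, state))


def weighted_utility(action: dict, state: UserState) -> float:
    """Assumed utility: goal-weighted sum of the action's attributes."""
    return sum(w * action.get(name, 0.0) for name, w in state.goals.items())


state = UserState(
    context={"budget": 20.0},
    goals={"speed": 1.0, "comfort": 0.5},
    constraints=[lambda a, c: a["cost"] <= c["budget"]],
)
options = [
    {"name": "taxi", "cost": 35.0, "speed": 0.9, "comfort": 0.9},
    {"name": "bus", "cost": 3.0, "speed": 0.4, "comfort": 0.5},
    {"name": "bike", "cost": 0.0, "speed": 0.6, "comfort": 0.3},
]
best = select_action(options, weighted_utility, state)
print(best["name"])  # bike: the taxi exceeds the budget, and bike (0.75) beats bus (0.65)
```

Treating constraints as hard filters rather than penalty terms keeps user overrides absolute: no utility gain can justify an action the user has ruled out.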
Critical to this orientation are (i) continuous adaptation to heterogeneous user needs, (ii) proactive identification and resolution of underspecified or latent intent, and (iii) respecting agency, privacy, and explicit user overrides (Qian et al., 29 Jul 2025, Eskenazi et al., 2019, Zhang et al., 17 Feb 2026).
2. Architectural and Algorithmic Design Patterns
User-centric agents are instantiated in diverse domains but share a set of foundational architectural principles:
- Modular Multi-Agent Orchestration: Separation between information management, decision/action, and reflection/feedback, often organized around central and distributed modules interacting via shared memory (Li et al., 22 Oct 2025, Jia et al., 9 Oct 2025, Saleh et al., 1 May 2025).
- Intent Recognition and Proactivity: Policy modules that can transition between reactive (instruction-following) and proactive (need-anticipating) modes, typically through complexity/proactivity scores or explicit intent alignment metrics (Zhao et al., 26 Aug 2025, Lyu et al., 14 Jan 2026, Yang et al., 20 May 2025).
- Multi-Domain and Multi-Modal Integration: Handling diverse data, sensors, interfaces, and sources (text, vision, audio, GUI, knowledge bases), with multimodal context encoding and memory (Yang et al., 20 May 2025, Li et al., 22 Oct 2025, Lyu et al., 14 Jan 2026).
- Personalization and Memory Augmentation: Persistent, structured user profiles capturing habits, preferences, and routines, updated continuously and used to condition all retrieval/generation (Chen et al., 2024, Lyu et al., 14 Jan 2026, Saleh et al., 1 May 2025, Zerhoudi et al., 2024).
- Human-in-the-Loop Interaction: Explicit support for interruption, correction, clarification, and override by user at all stages; interface co-design to ensure transparency and control (Hua et al., 2024, Jia et al., 9 Oct 2025, Eskenazi et al., 2019).
Illustrative Agent System Component Table:
| Module | Role | Example Implementations |
|---|---|---|
| Intent Recognizer | Detect user aim, resolve ambiguity | Proactivity scoring (Zhao et al., 26 Aug 2025), RL Gym (Qian et al., 24 Sep 2025) |
| Memory & Profile | Store long-term preferences/routines | Apollonion (Chen et al., 2024), HIM-Agent (Lyu et al., 14 Jan 2026) |
| Decision/Execution | Action selection, tool use, policy | Task Orchestrator (Li et al., 22 Oct 2025), DmA (Jia et al., 9 Oct 2025) |
| Feedback/Reflection | Evaluate outcome, detect misalignment | RA (Jia et al., 9 Oct 2025), Reflection (Li et al., 22 Oct 2025) |
| UI/Interface | User control, transparency, override | Streamlit 3-pane UI (Zhao et al., 26 Aug 2025), ISP UI (Hua et al., 2024) |
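The module roles in the table can be wired together in many ways. The following is a minimal sketch of one turn through such a pipeline, assuming each module is a plain function and that the profile is a dictionary. `Turn`, `run_turn`, and the toy `recognize`, `decide`, and `reflect` functions are hypothetical and do not correspond to any cited system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Turn:
    utterance: str
    intent: str = ""
    action: str = ""
    needs_user: bool = False
    log: List[str] = field(default_factory=list)  # surfaced in the UI for transparency


def run_turn(utterance: str, profile: Dict[str, str],
             recognize: Callable, decide: Callable, reflect: Callable) -> Turn:
    """Run intent recognition, decision, and reflection, logging every step."""
    turn = Turn(utterance)
    turn.intent = recognize(utterance, profile)
    turn.log.append(f"intent={turn.intent}")
    turn.action = decide(turn.intent, profile)
    turn.log.append(f"action={turn.action}")
    turn.needs_user = reflect(turn.intent, turn.action)
    turn.log.append(f"needs_user={turn.needs_user}")
    return turn


# Toy stand-ins for the modules (assumptions for illustration only).
def recognize(utterance: str, profile: Dict[str, str]) -> str:
    return "order_coffee" if "coffee" in utterance.lower() else "unknown"


def decide(intent: str, profile: Dict[str, str]) -> str:
    if intent == "order_coffee" and "coffee" in profile:
        return f"order:{profile['coffee']}"
    return "ask_user"  # fall back to clarification instead of guessing


def reflect(intent: str, action: str) -> bool:
    # Flag the turn for user input whenever the agent could not commit to an action.
    return action == "ask_user"


turn = run_turn("Coffee, please", {"coffee": "flat white"}, recognize, decide, reflect)
print(turn.log)
```

Keeping the log on the turn object means the interface layer can display the agent's reasoning without reaching into individual modules.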
Key algorithms include dynamic context folding (U-Fold (Su et al., 26 Jan 2026)), hierarchical memory management (HIM-Agent (Lyu et al., 14 Jan 2026)), process-oriented RL for multi-turn preference discovery (UserRL (Qian et al., 24 Sep 2025)), and Value-of-Information-driven orchestration (UserCentrix (Saleh et al., 1 May 2025)).
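To give a feel for what intent-aware context folding might involve, here is a deliberately simple heuristic. It is not the U-Fold algorithm: it keeps the most recent turns verbatim, keeps older turns that share a word with the current intent, and collapses everything else into a placeholder. The function name `fold_context`, the word-overlap test, and the placeholder format are assumptions.

```python
from typing import Iterable, List


def fold_context(turns: List[str], intent_terms: Iterable[str], keep_recent: int = 2) -> List[str]:
    """Collapse old turns unrelated to the current intent into short placeholders."""
    intent = {t.lower() for t in intent_terms}
    cutoff = max(len(turns) - keep_recent, 0)  # turns at or after this index are always kept
    folded: List[str] = []
    dropped = 0
    for i, turn in enumerate(turns):
        words = set(turn.lower().split())
        if i >= cutoff or words & intent:
            if dropped:
                folded.append(f"[{dropped} earlier turn(s) folded]")
                dropped = 0
            folded.append(turn)
        else:
            dropped += 1
    if dropped:  # only reachable when keep_recent == 0
        folded.append(f"[{dropped} earlier turn(s) folded]")
    return folded


history = [
    "book a flight to Rome",
    "what is the weather today",
    "tell me a joke",
    "prefer window seat on the flight",
    "ok",
    "thanks",
]
print(fold_context(history, ["flight"], keep_recent=2))
```

A production system would likely replace the placeholder with a generated summary and the word-overlap test with a learned relevance score; the control flow, however, stays the same.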
3. Personalization and Long-term User Alignment
Personalization is achieved via explicit user profiles, histories of interaction, and memory-augmented reasoning. Agents such as HIM-Agent (Lyu et al., 14 Jan 2026) maintain persistent, hierarchically organized personal memories to resolve omitted preferences (“Buy my usual coffee”) and anticipate routines for proactive suggestions (“Check weather at 7am”). AppAgent-Pro (Zhao et al., 26 Aug 2025) and ColorAgent (Li et al., 22 Oct 2025) decompose vague queries into sub-tasks or extract user-specific Standard Operating Procedures (SOPs).
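The two behaviors described above, filling in an omitted preference and surfacing a routine, can be sketched with a small frequency-based memory. This is an illustrative stand-in, not HIM-Agent's implementation: the class `PersonalMemory`, its two-level layout, and the `min_count` threshold are assumptions.

```python
from collections import Counter, defaultdict
from typing import List, Optional


class PersonalMemory:
    """Hypothetical two-part memory: per-category preferences and time-keyed routines."""

    def __init__(self) -> None:
        self.preferences = defaultdict(Counter)  # category -> item -> times chosen
        self.routines = Counter()                # (hour, action) -> times observed

    def record_choice(self, category: str, item: str) -> None:
        self.preferences[category][item] += 1

    def record_event(self, hour: int, action: str) -> None:
        self.routines[(hour, action)] += 1

    def resolve_usual(self, category: str) -> Optional[str]:
        """Resolve 'my usual <category>' to the most frequently chosen item."""
        counts = self.preferences.get(category)
        if not counts:
            return None  # no history: the agent should ask rather than guess
        return counts.most_common(1)[0][0]

    def suggest(self, hour: int, min_count: int = 3) -> List[str]:
        """Return actions observed at this hour at least min_count times."""
        return sorted(
            action for (h, action), n in self.routines.items()
            if h == hour and n >= min_count
        )


mem = PersonalMemory()
for item in ["oat latte", "espresso", "oat latte"]:
    mem.record_choice("coffee", item)
for _ in range(3):
    mem.record_event(7, "check weather")
mem.record_event(7, "read news")

print(mem.resolve_usual("coffee"))  # oat latte
print(mem.suggest(7))               # ['check weather']
```

Returning `None` for an unseen category gives the caller a clear signal to fall back to a clarification question.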
Evaluation metrics quantify how well agents achieve user-specific outcomes:
| Metric | Description | Cited System |
|---|---|---|
| Step-wise Success Rate | Fraction of correct step execution in GUI tasks | HIM-Agent (Lyu et al., 14 Jan 2026) |
| Cumulative Error Rate | Weighted sum of step failures under vagueness | HIM-Agent (Lyu et al., 14 Jan 2026) |
| Personalization Score | Embedding similarity between profile and response | Apollonion (Chen et al., 2024) |
| MobileIAR | Intention Alignment Rate between agent action and user label | ColorAgent (Li et al., 22 Oct 2025) |
Systems that leverage long-term, structured user records and routines exhibit significant improvements in both execution and proactivity metrics. For instance, HIM-Agent yields 15.7% and 7.3% gains in execution and proactive identification, respectively, over standard baselines (Lyu et al., 14 Jan 2026).
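Two of the metrics in the table are simple enough to sketch directly. The definitions below follow the table's one-line descriptions rather than the exact formulas in the cited papers: step-wise success is taken as the fraction of matching steps, and the personalization score is assumed to be cosine similarity between precomputed embedding vectors. Function names and inputs are hypothetical.

```python
import math
from typing import List, Sequence


def step_success_rate(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Fraction of steps where the predicted action equals the reference action."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same number of steps")
    if not expected:
        return 0.0
    hits = sum(p == e for p, e in zip(predicted, expected))
    return hits / len(expected)


def personalization_score(profile_vec: List[float], response_vec: List[float]) -> float:
    """Cosine similarity between a profile embedding and a response embedding."""
    if len(profile_vec) != len(response_vec):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(profile_vec, response_vec))
    norm = math.sqrt(sum(a * a for a in profile_vec)) * math.sqrt(sum(b * b for b in response_vec))
    return dot / norm if norm else 0.0  # zero vectors carry no signal


rate = step_success_rate(
    ["tap", "type", "tap", "swipe"],
    ["tap", "type", "scroll", "swipe"],
)
print(rate)  # 0.75
print(personalization_score([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

An intention alignment rate of the kind listed for MobileIAR has the same shape as `step_success_rate`, with user-provided intent labels in place of reference steps.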
4. Proactive Interaction, Collaboration, and Human Control
User-centric agents are distinguished by their ability to act without explicit command, yet only within boundaries that respect user control and preference. Proactivity is mediated by:
- Scoring models such as Complexity/Proactivity thresholds (Zhao et al., 26 Aug 2025) and multi-modal context-prediction (Yang et al., 20 May 2025).
- User clarification and escalation: Conflict-aware planners escalate for human input when ambiguity, contradiction, or sensitive steps arise (Jia et al., 9 Oct 2025, Li et al., 22 Oct 2025).
- Proactive tool use: Agents dynamically call external services, propose options, and integrate multiple modalities to provide contextually rich support (Yang et al., 20 May 2025).
- Feedback, audit, and override: System architectures expose reasoning logs, subtask lists, and status to the user, facilitating trust and transparency (Zhao et al., 26 Aug 2025, Hua et al., 2024).
Failure to tune the balance between autonomy and user oversight can lead to confusion or loss of trust (Zhao et al., 26 Aug 2025).
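One way to picture this balance is a small gate that decides, per request, whether the agent simply follows the instruction, acts proactively, or hands control back to the user. The sketch below assumes a scalar proactivity score in [0, 1] and a fixed threshold of 0.7; both the score and the threshold are hypothetical, as are the names `Mode` and `choose_mode`.

```python
from enum import Enum
from typing import Tuple


class Mode(Enum):
    REACTIVE = "reactive"    # follow the instruction as given
    PROACTIVE = "proactive"  # anticipate needs and propose extra steps
    ESCALATE = "escalate"    # stop and ask the user


def choose_mode(proactivity_score: float, ambiguous: bool = False,
                sensitive: bool = False, threshold: float = 0.7) -> Tuple[Mode, str]:
    """Return the interaction mode plus a human-readable reason for the audit log."""
    # User control takes priority: ambiguity or sensitive steps always escalate.
    if ambiguous or sensitive:
        reason = "ambiguous request" if ambiguous else "sensitive step"
        return Mode.ESCALATE, reason
    if proactivity_score >= threshold:
        return Mode.PROACTIVE, f"score {proactivity_score:.2f} >= threshold {threshold:.2f}"
    return Mode.REACTIVE, f"score {proactivity_score:.2f} < threshold {threshold:.2f}"


print(choose_mode(0.85))                  # proactive
print(choose_mode(0.85, sensitive=True))  # escalate, regardless of score
print(choose_mode(0.30))                  # reactive
```

Returning the reason alongside the mode lets the interface show why the agent acted or deferred, which supports the audit and override patterns listed above.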
5. Benchmarking, Evaluation, and Performance
High-fidelity evaluation of user-centric agents requires domain-adaptive, multi-turn, preference-driven testbeds. UserBench (Qian et al., 29 Jul 2025), VitaBench, and AndroidIntent (Lyu et al., 14 Jan 2026) each simulate evolving, underspecified objectives and incremental preference revelation. Findings across these benchmarks include:
- Preference revelation is poor in current agents: On UserBench, even leading LLM agents elicit less than 30% of user preferences through proactive collaboration. Full alignment with all user goals is observed in only ~20% of cases (Qian et al., 29 Jul 2025).
- Context management is essential: Dynamic intent-aware context folding outperforms static methods, especially in long, noisy, multi-turn tasks (U-Fold achieves up to 71.4% win rate on VitaBench) (Su et al., 26 Jan 2026).
- RL reward shaping and trajectory scoring impact alignment: Dense, turn-wise feedback and reward-to-go aggregation enable more effective, adaptive multi-turn interactions (Qian et al., 24 Sep 2025).
- Process orientation over outcome-only optimization: Agents that treat the user’s workflow as a stateful process, rather than a single-shot goal, align better with evolving intent and feedback (Narechania et al., 28 Jun 2025).
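The reward-to-go aggregation mentioned in the list above can be made concrete with the standard discounted form $G_t = \sum_{k=t}^{T} \gamma^{k-t} r_k$, computed in one backward pass over turn-level rewards. This is the generic reinforcement-learning definition, not necessarily the exact scoring scheme used in UserRL; the reward values and discount factor are made up for illustration.

```python
from typing import List, Sequence


def reward_to_go(rewards: Sequence[float], gamma: float = 1.0) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every turn t, computed back to front."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


turn_rewards = [0.0, 1.0, 0.0, 2.0]  # e.g. a preference elicited at turn 1, task solved at turn 3
print(reward_to_go(turn_rewards, gamma=0.5))  # [0.75, 1.5, 1.0, 2.0]
print(reward_to_go(turn_rewards))             # [3.0, 3.0, 2.0, 2.0]
```

With dense turn-wise rewards, early turns that set up later success (such as a good clarification question) receive credit through the accumulated return rather than only through a final outcome score.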
6. Privacy, Autonomy, and Governance
Fundamental to user-centricity is rigorous respect for privacy, autonomy, and trust boundaries. Recent works propose:
- Local ownership of data and models: Agents run on-device or in private cloud, memory and profile data are encrypted under user-held keys, with explicit revocation and forgetting mechanisms (Carbery et al., 5 Oct 2025, Zhang et al., 17 Feb 2026).
- Policy and constraint enforcement: Agents expose guardrail policies and check conformance for all tool/API calls, supporting user overrides at granular levels (Zhang et al., 17 Feb 2026, Li et al., 22 Oct 2025).
- Interoperability and open governance: Credentialing, communication standards (e.g., verifiable DIDs, clearinghouse APIs), and regime-neutral compliance are required to prevent anticompetitive lock-in and to strengthen the agent’s role as a fiduciary advocate for the user (Carbery et al., 5 Oct 2025, Zhang et al., 17 Feb 2026).
- User agency in engagement: Design patterns prioritize transparency (real-time logs, explainable planning), direct user feedback (NUF, session-level satisfaction), and portable memory formats (Eskenazi et al., 2019, Zhang et al., 17 Feb 2026).
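The policy-enforcement and override patterns in the list above can be sketched as a check that runs before every tool call and records its decision. The policy format, the tool names, and the functions `GuardrailPolicy.check` and `guarded_call` are hypothetical; real systems would add authentication, persistence, and richer policy languages.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass
class GuardrailPolicy:
    allowed_tools: Set[str]
    needs_confirmation: Set[str] = field(default_factory=set)
    user_overrides: Dict[str, str] = field(default_factory=dict)  # tool -> "allow" | "deny"

    def check(self, tool: str) -> str:
        """Return 'allow', 'confirm', or 'deny'; explicit user overrides win."""
        if tool in self.user_overrides:
            return self.user_overrides[tool]
        if tool not in self.allowed_tools:
            return "deny"
        if tool in self.needs_confirmation:
            return "confirm"
        return "allow"


def guarded_call(policy: GuardrailPolicy, tool: str, fn: Callable[[], str],
                 audit_log: List[Tuple[str, str]], confirmed: bool = False) -> Optional[str]:
    """Run fn only if the policy permits it, recording every decision for audit."""
    decision = policy.check(tool)
    audit_log.append((tool, decision))
    if decision == "allow" or (decision == "confirm" and confirmed):
        return fn()
    return None  # blocked, or waiting for the user's confirmation


policy = GuardrailPolicy(
    allowed_tools={"calendar.read", "payments.send"},
    needs_confirmation={"payments.send"},
    user_overrides={"location.share": "deny"},
)
log: List[Tuple[str, str]] = []
print(guarded_call(policy, "calendar.read", lambda: "3 events", log))           # 3 events
print(guarded_call(policy, "payments.send", lambda: "paid", log))               # None
print(guarded_call(policy, "payments.send", lambda: "paid", log, confirmed=True))  # paid
print(log)
```

Because the audit log records denied and pending calls as well as executed ones, it can back the real-time logs that the transparency pattern calls for.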
7. Open Challenges and Research Directions
Open questions and technical challenges include:
- Scalability and efficiency: Managing context/long-term memory for multi-session, multi-user environments with fixed resource budgets (Su et al., 26 Jan 2026, Saleh et al., 1 May 2025).
- Reliable alignment: Formalizing user-defined utility in various domains and preventing reward/specification gaming (Carbery et al., 5 Oct 2025, Zhang et al., 17 Feb 2026).
- Privacy-preserving computation: Differential privacy schemes for tiered memories, on-device incremental learning, and federated model updates (Narechania et al., 28 Jun 2025, Zhang et al., 17 Feb 2026).
- Cold-start and generalization: Bootstrapping personalization for new users with sparse data, and cross-device or cross-application state synchronization (Lyu et al., 14 Jan 2026, Chen et al., 2024).
- Ecosystem and governance: Establishing open, competitive markets for agent modules, enforcing data portability, and designing audit-friendly, user-aligned protocols (Carbery et al., 5 Oct 2025, Zhang et al., 17 Feb 2026).
- Evaluation: Developing benchmarks and metrics that go beyond task success to capture depth of personalization, user satisfaction, routine alignment, and transparency (Qian et al., 29 Jul 2025, Chen et al., 2024, Lyu et al., 14 Jan 2026).
A user-centric agent, as defined and implemented in the current literature, is characterized by explicit optimization for user value, process-oriented and proactive support, deep memory-based personalization, rigorous user agency and privacy control, and sustained alignment over multi-turn, real-world interaction settings (Zhao et al., 26 Aug 2025, Lyu et al., 14 Jan 2026, Jia et al., 9 Oct 2025, Zhang et al., 17 Feb 2026, Eskenazi et al., 2019). The trajectory of research suggests progressive formalization of user value, modular agent design, human-in-the-loop co-adaptation, and governance structures that grant primacy to user goals over platform or agent objectives.