Multi-User Large Language Model Agents

This presentation examines a critical gap in large language model deployment: their inability to serve multiple users simultaneously and effectively. While LLMs excel in single-user scenarios, they struggle with the conflicting objectives, information asymmetry, and privacy constraints inherent in multi-user settings. Through systematic stress testing across instruction following, privacy preservation, and coordination tasks, this work reveals fundamental limitations in current models and establishes a framework for evaluating multi-principal decision problems in collaborative AI systems.
Script
Language models are everywhere in collaborative tools, from Slack to email clients to project management platforms. But there's a problem hiding in plain sight: these systems were built for one user at a time, and they break down when multiple people with conflicting needs try to use them together.
The researchers formalize this as a multi-principal decision problem. Unlike single-user scenarios where the model optimizes for one fixed objective, multi-user settings demand role-aware reasoning, selective context sharing, and cross-user coordination. Current models simply weren't designed for this.
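To make the multi-principal framing concrete, here is a minimal illustrative sketch, not the paper's formalization: each principal (user) carries its own utility over actions, and the agent must aggregate across them instead of optimizing a single fixed objective. The `Principal` class, the unweighted-sum aggregation, and the meeting-slot payoffs are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Principal:
    """One user (principal) with a private payoff over candidate actions."""
    name: str
    utility: dict  # action -> payoff for this user

def select_action(principals, actions):
    """Pick the action maximizing total utility across all principals.

    A single-user agent would maximize one utility function; a
    multi-principal agent must aggregate — here with a simple
    unweighted sum, one of many possible aggregation rules.
    """
    return max(actions, key=lambda a: sum(p.utility.get(a, 0) for p in principals))

# Two users with partially conflicting preferences over meeting slots.
alice = Principal("alice", {"9am": 3, "2pm": 1})
bob = Principal("bob", {"9am": 1, "2pm": 2, "4pm": 3})

print(select_action([alice, bob], ["9am", "2pm", "4pm"]))  # → 9am (total 4 beats 3 and 3)
```

Even this toy version exposes the core tension: any aggregation rule implicitly trades one user's objective against another's, which a single-user model never has to do.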
To reveal these limitations, the authors designed three stress tests that push models to their breaking point.
When users issue conflicting instructions, models fail to maintain stable prioritization. Privacy violations accumulate across interactions as models leak information between users. Coordination tasks like meeting scheduling require exponentially more turns as the group grows, creating severe efficiency bottlenecks.
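The coordination bottleneck can be seen in a toy model (my own illustration, not the benchmark's protocol): an agent that polls one candidate slot per turn until every user accepts. As the group grows, shared slots become rarer, so the turn count climbs or consensus fails outright.

```python
import itertools

def turns_to_consensus(user_prefs):
    """Naive one-proposal-per-turn polling over candidate slots.

    Mimics brute-force iteration: each turn the agent proposes the
    next candidate slot and asks every user; returns the turn on
    which all users accepted, or None if no common slot exists.
    """
    all_slots = sorted(set(itertools.chain.from_iterable(user_prefs)))
    for turn, slot in enumerate(all_slots, start=1):
        if all(slot in prefs for prefs in user_prefs):
            return turn
    return None

small = [{"9am", "2pm"}, {"2pm", "4pm"}]          # 2 users agree on 2pm
large = small + [{"4pm", "5pm"}, {"9am", "5pm"}]  # 4 users share no slot
print(turns_to_consensus(small), turns_to_consensus(large))  # → 1 None
```

The point of the sketch is the mechanism, not the numbers: turn-by-turn polling scales with the size of the candidate space rather than reasoning jointly over all constraints at once.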
This visualization captures the scalability crisis. The blue line shows success rates under full information disclosure, while red shows partial disclosure where users have private constraints. Notice how success drops sharply beyond 10 users in the partial disclosure setting, precisely when real-world coordination becomes challenging. Meanwhile, the number of turns required to reach consensus climbs steadily, demonstrating that current models solve multi-user problems through brute-force iteration rather than intelligent reasoning.
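The selective context sharing the paper calls for can be sketched as follows. This is a hypothetical design of my own, not the authors' system: each message in a shared agent's memory is tagged with the users allowed to see it, and each user's prompt is assembled only from messages visible to them, so private constraints never leak across users.

```python
class SharedAgentContext:
    """Per-user visibility tags over a shared message store (illustrative)."""

    def __init__(self):
        self.messages = []  # list of (text, visible_to) pairs

    def add(self, text, visible_to):
        """Record a message along with the set of users who may see it."""
        self.messages.append((text, frozenset(visible_to)))

    def context_for(self, user):
        """Build one user's view: only messages they are allowed to see."""
        return [text for text, visible in self.messages if user in visible]

ctx = SharedAgentContext()
ctx.add("Team standup moved to Friday", {"alice", "bob"})
ctx.add("Alice is interviewing elsewhere on Thursday", {"alice"})  # private to alice

print(ctx.context_for("bob"))  # → ['Team standup moved to Friday']
```

A model that conditions on the full shared history has no such boundary, which is exactly how the cross-user leakage measured in the privacy stress test arises.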
These aren't academic edge cases. Organizations are deploying language models into collaborative environments right now, where serving multiple users simultaneously is the norm, not the exception. Without principled solutions to conflict resolution and privacy preservation, these deployments risk systematic failures that undermine user trust and operational efficiency.
The single-user paradigm that shaped language model development is meeting its limits. To learn more about multi-principal decision problems and create your own research video, visit EmergentMind.com.