PAC-BENCH: Evaluating Multi-Agent Collaboration under Privacy Constraints

Published 13 Apr 2026 in cs.AI and cs.MA | (2604.11523v1)

Abstract: We are entering an era in which individuals and organizations increasingly deploy dedicated AI agents that interact and collaborate with other agents. However, the dynamics of multi-agent collaboration under privacy constraints remain poorly understood. In this work, we present PAC-Bench, a benchmark for systematic evaluation of multi-agent collaboration under privacy constraints. Experiments on PAC-Bench show that privacy constraints substantially degrade collaboration performance and make outcomes depend more on the initiating agent than the partner. Further analysis reveals that this degradation is driven by recurring coordination breakdowns, including early-stage privacy violations, overly conservative abstraction, and privacy-induced hallucinations. Together, our findings identify privacy-aware multi-agent collaboration as a distinct and unresolved challenge that requires new coordination mechanisms beyond existing agent capabilities.

Authors (8)

Summary

The paper introduces PAC-BENCH as a benchmark that evaluates the trade-offs between collaborative task success and privacy adherence in multi-agent scenarios.
It employs a turn-based framework with agent-specific private memories and robust metrics, including incremental task scores and holistic privacy compliance measures.
Empirical findings reveal that privacy constraints significantly degrade performance, with initiator dominance and early-stage violations undermining effective collaboration.

Evaluating Multi-Agent Collaboration under Privacy Constraints: PAC-BENCH

Motivation and Background

As AI agents transition from monolithic deployments to personalized and organizationally specialized agents, privacy emerges as a fundamental constraint rather than an ancillary consideration. PAC-BENCH addresses a critical gap: the lack of systematic evaluation for collaboration among private agents (agents mandated to serve the interests and safeguard the information of specific owners) when explicit privacy constraints govern their interactions. Existing benchmarks for multi-agent systems focus on collective coordination and task completion, but they are misaligned with real-world requirements, where privacy restrictions preclude full observability and information sharing. The result is persistent information asymmetry that calls for new coordination strategies.

The core question investigated in this work is how collaborative performance and adherence to privacy constraints interact and often conflict, a trade-off of both practical and theoretical significance for agent deployments in sensitive environments.

Figure 1: Privacy-constrained multi-agent collaboration, with each agent preserving private memory and communicating actionable proposals while masking sensitive information.

Benchmark: Task Formulation and Scenario Construction

PAC-BENCH formalizes the problem as a turn-based LLM agent collaboration environment, where each episode is defined by two agents, agent-specific private memory, explicit privacy constraints, and a joint goal requiring non-trivial inter-agent coordination. At each time step, the active agent selects actions based not only on partial observations but also on internal memory and local privacy policies, resulting in a trajectory that is evaluated for both task completion and privacy compliance.
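The episode structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's implementation: the names `Agent`, `Episode`, and `scripted_policy`, the `DONE` stop signal, and the turn limit are all assumptions introduced here, and the scripted policies stand in for LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Transcript = List[Tuple[str, str]]  # (speaker name, message) pairs


@dataclass
class Agent:
    name: str
    private_memory: dict            # visible only to this agent (hypothetical layout)
    privacy_constraints: List[str]  # owner-specific rules in natural language
    policy: Callable[["Agent", Transcript], str]  # stand-in for an LLM call


@dataclass
class Episode:
    goal: str
    agents: Tuple[Agent, Agent]
    max_turns: int = 10             # assumed cap, not taken from the paper
    transcript: Transcript = field(default_factory=list)

    def run(self) -> Transcript:
        for turn in range(self.max_turns):
            active = self.agents[turn % 2]  # agents[0] is the initiator
            # The active agent sees the shared transcript plus its own memory
            # and constraints, never the partner's private memory.
            message = active.policy(active, self.transcript)
            self.transcript.append((active.name, message))
            if message.strip().upper().startswith("DONE"):
                break
        return self.transcript


def scripted_policy(lines: List[str]) -> Callable[[Agent, Transcript], str]:
    """Return a policy that replays fixed lines, then signals completion."""
    it = iter(lines)
    return lambda agent, transcript: next(it, "DONE")


a = Agent("A", {"budget": 120}, ["never reveal the exact budget"],
          scripted_policy(["Can we settle on a price below 150?", "DONE: agreed"]))
b = Agent("B", {"cost": 90}, ["never reveal unit cost"],
          scripted_policy(["A price below 150 works for us."]))
transcript = Episode(goal="agree on a price", agents=(a, b)).run()
```

The resulting `transcript` is the trajectory that would then be scored for task completion and privacy compliance.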

A robust scenario construction pipeline underpins the benchmark, leveraging requirement decomposition to generate agent-specific memories that reflect realistic domain practices and restricting information flows according to constraints grounded in established confidentiality and security standards (e.g., ISO/IEC 29100:2024). Each scenario passes through LLM-based and human validation to ensure analytical tractability, diversity, and the existence of genuine disclosure/collaboration trade-offs.

Figure 2: Turn-based evaluation framework where agents interact under explicit privacy constraints, and their trajectory is evaluated for success and privacy violations.

Figure 3: End-to-end scenario generation pipeline with human and rule-based refinement, constructing the PAC-Bench dataset.

Evaluation Metrics

PAC-BENCH introduces both incremental (partial) and holistic metrics. Task Score (TS) quantifies the fraction of satisfied collaborative requirements independent of privacy considerations, while Privacy Score (PS) assesses the degree of constraint adherence at each interaction, using automated LLM-based evaluators and human annotation for calibration. Holistic episode-level metrics require the simultaneous satisfaction of all task and privacy requirements ($\mathrm{Acc}_{\mathrm{joint}}$), reflecting real-world deployment semantics where both utility and compliance are mandatory.
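To make the relationship between the partial and holistic metrics concrete, the sketch below computes them from per-episode boolean judgments. It rests on assumptions: the summary does not give the exact aggregation for PS, so it is taken here to be the fraction of evaluated interactions judged compliant, and `EpisodeResult` and its field names are hypothetical. Both lists are assumed non-empty.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EpisodeResult:
    requirements_met: List[bool]  # one flag per collaborative requirement
    turn_compliant: List[bool]    # one flag per evaluated interaction


def task_score(r: EpisodeResult) -> float:
    """Fraction of task requirements satisfied, ignoring privacy."""
    return sum(r.requirements_met) / len(r.requirements_met)


def privacy_score(r: EpisodeResult) -> float:
    """Fraction of interactions judged compliant (assumed aggregation)."""
    return sum(r.turn_compliant) / len(r.turn_compliant)


def joint_success(r: EpisodeResult) -> bool:
    """An episode counts only if every requirement and every turn passes."""
    return all(r.requirements_met) and all(r.turn_compliant)


def joint_accuracy(results: List[EpisodeResult]) -> float:
    """Share of episodes that satisfy all task and privacy requirements."""
    return sum(joint_success(r) for r in results) / len(results)


# Illustrative judgments only, not results from the paper.
results = [
    EpisodeResult([True, True], [True, True, True]),  # passes both
    EpisodeResult([True, False], [True, True]),       # task incomplete
    EpisodeResult([True, True], [True, False]),       # one privacy violation
]
```

Because the joint metric is all-or-nothing per episode, it can fall well below both average TS and average PS, which is consistent with the holistic numbers being much lower than the partial scores.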

Experimental Results and Quantitative Findings

Experiments with state-of-the-art LLM agents (GPT-5.1, Claude-4.5-Sonnet, LLaMA-3.3-70B, Qwen-3-32B) reveal several robust and, in some cases, counterintuitive patterns:

Privacy constraints consistently result in substantial performance degradation across all models. Task scores decrease by up to 10–15 points relative to unconstrained baselines. Notably, the holistic joint accuracy (task and privacy) is typically less than 60% for leading models.
Collaboration efficacy is dominated by the initiating agent: performance is more strongly determined by which model initiates the solution than by the partner, highlighting a fundamental interaction asymmetry not apparent without privacy constraints.
Holistic episode-level success rates drop below 20% in tool-use scenarios, showing that tool manipulation under privacy constraints is a key Achilles' heel for current models.
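One simple way to check for the initiator asymmetry noted above is to compare how much the mean score varies across initiators with how much it varies across partners. The sketch below is an assumed analysis, not the procedure used in the paper; `marginal_spread` is a hypothetical helper, and the model names and scores are placeholders chosen only to illustrate the computation.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Tuple


def marginal_spread(scores: Dict[Tuple[str, str], float], axis: int) -> float:
    """Range of mean scores when grouping pairings by one role.

    scores maps (initiator, partner) to a score; axis=0 groups by
    initiator, axis=1 groups by partner.
    """
    groups = defaultdict(list)
    for pairing, score in scores.items():
        groups[pairing[axis]].append(score)
    means = [mean(values) for values in groups.values()]
    return max(means) - min(means)


# Placeholder scores keyed by (initiator, partner); not data from the paper.
scores = {
    ("model_x", "model_x"): 0.60,
    ("model_x", "model_y"): 0.58,
    ("model_y", "model_x"): 0.32,
    ("model_y", "model_y"): 0.30,
}

initiator_spread = marginal_spread(scores, axis=0)
partner_spread = marginal_spread(scores, axis=1)
```

A much larger spread along the initiator axis than along the partner axis would match the pattern the summary describes, though a proper analysis would also need variance estimates across scenarios.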

Failure Modes in Privacy-Constrained Collaboration

Systematic error analysis uncovers recurring and non-trivial failure patterns:

Early-stage privacy violations (Figure 4): The overwhelming majority of privacy infractions occur during the first three turns of interaction, implicating initial information exchange protocols and unstable disclosure strategies as primary vulnerabilities.
Over-conservative abstraction: Agents, to avert privacy breaches, resort to excessive abstraction or omission, resulting in insufficient informativeness for their partners and impeding task resolution.
Privacy-induced hallucination: When direct disclosure is precluded, agents sometimes fabricate plausible but incorrect details, misleading the collaborator and generating invalid final outputs—41% of observed task failures exhibit this characteristic.
Figure 4: Illustration of the failure modes that arise under privacy constraints, namely early privacy violations, over-conservative abstraction, and privacy-induced hallucinations.
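The early-stage concentration of violations can be quantified from per-episode violation logs. The following is a minimal sketch under assumptions: turns are 1-indexed, each episode is represented by the list of turns at which an evaluator flagged a violation, and the function name and input format are hypothetical rather than part of the benchmark.

```python
from typing import List


def early_violation_share(violation_turns: List[List[int]], k: int = 3) -> float:
    """Share of all flagged violations that occur within the first k turns.

    violation_turns holds one list per episode with the 1-indexed turns
    at which a privacy violation was flagged. Returns 0.0 if no
    violations were recorded.
    """
    all_turns = [turn for episode in violation_turns for turn in episode]
    if not all_turns:
        return 0.0
    return sum(turn <= k for turn in all_turns) / len(all_turns)


# Illustrative logs only: four episodes, one of them violation-free.
logs = [[1, 2], [3], [], [5]]
share = early_violation_share(logs, k=3)
```

Varying `k` gives a cumulative view of when violations occur, which is one way to test whether the first three turns really account for most of them.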

Ablations and Protocol Analyses

Explicit privacy prompting is necessary but not sufficient: omitting privacy instructions in system prompts sharply reduces privacy compliance, yet chain-of-thought privacy reasoning does not reliably mitigate task-privacy trade-offs. Protocol variations shifting initiative to the non-default agent do not attenuate the observed initiator dominance, countering simple hypotheses about turn-order artifacts and underscoring the intrinsic challenges of decentralization under asymmetric constraints.

Model Robustness and Collaboration Dynamics

Empirical analysis in the joint privacy-task space reveals that:

GPT-5.1 establishes a stable anchor in both task and privacy success metrics, reliably guiding collaboration regardless of partner model.
Collaborator selection induces high variance for non-GPT models, exacerbated under adversarial partner configurations, with success rates dropping precipitously in the presence of uncooperative or misaligned agents.
Figure 5: Distributions of model performance in the joint task-privacy space across agent pairings; GPT remains a robust anchor while other models show high variance.

Distinct privacy constraint types (range-based vs. change-based) affect difficulty but do not alter overall qualitative model ranking, with change-based constraints more frequently leading to failure.

Theoretical and Practical Implications

PAC-BENCH reveals that privacy-constrained multi-agent collaboration cannot be trivially solved by current LLM agents, even with strong task-solving architectures. The persistent gap highlights theoretical limits in current LLM-based coordination: information asymmetry and privacy constraints disrupt the iterative establishment of common knowledge, introduce unrecoverable early missteps, and impose new types of alignment and negotiation challenges. Failure analysis shows that models are not yet equipped with effective partial-disclosure, uncertainty calibration, or context-sensitive negotiation strategies.

On the practical side, these results expose deployment risks for organizational and personal agent systems where privacy is non-negotiable. Without principled advances in privacy-aware reasoning and interaction protocols, current agentic frameworks are likely to leak information, underperform, or hallucinate in ways that undermine both utility and trust.

Future Research Directions

Progress will demand new architectures that embed privacy as a core inductive bias, refined prompt engineering for context-adaptive constraint reasoning, tool-use robustness under partial observability, and possibly the integration of explicit negotiation/justification phases early in the interaction protocol to preempt privacy-utility deadlocks. Realism will also require dynamic constraints and extensions to larger agent collectives, where coalition behaviors and indirect inference risks are amplified.

Conclusion

PAC-BENCH establishes privacy-aware multi-agent collaboration as a distinct, unsolved challenge that necessitates rethinking both agent architectures and evaluation paradigms. The benchmark's structured, systematic approach uncovers the sharp tension between collaboration and constraint, exposes the fragility of existing solutions, and offers a foundation for developing agents that can be deployed in settings where privacy, ownership, and interaction-dependent behavior are indispensable system properties.