Operationalizing Inter-Agent Trust as a First-Class Security Variable in LLM-Based Multi-Agent Systems

Characterize and govern inter-agent trust in large language model-based multi-agent systems: formally define and audit its strength, scope, and revocability, and determine how these dimensions affect exposure risk and error propagation under the Minimum Necessary Information principle.

Background

The paper introduces the Trust-Vulnerability Paradox (TVP), arguing that increased inter-agent trust, while improving coordination, simultaneously heightens risks of over-exposure and over-authorization in LLM-based multi-agent systems. The authors position trust not as a mere social assumption but as a controllable operational variable whose intensity and boundaries influence sensitive information disclosure and error propagation.

Within this framing, the authors highlight the need to explicitly model trust with parameters such as strength, scope, and revocability, and to treat these as auditable, schedulable security variables. Their empirical work and metrics (e.g., Over-Exposure Rate and Authorization Drift) are presented as steps toward this goal, but the challenge of formally defining, auditing, and governing trust's dimensions and their effects remains open.
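To make the framing concrete, the three dimensions can be sketched as fields of an explicit, auditable trust relation between agents. This is a minimal illustrative sketch, not the paper's implementation: the field names mirror the paper's dimensions (strength, scope, revocability), but the `TrustEdge` type and the Over-Exposure Rate formula below are assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class TrustEdge:
    """Hypothetical trust relation from one agent toward a peer."""
    strength: float      # in [0, 1]: how much of the peer's claims are accepted
    scope: Set[str]      # labels of information/actions the peer is authorized for
    revocable: bool = True  # whether granted permissions can be withdrawn

    def revoke(self, labels: Set[str]) -> None:
        """Withdraw authorization for the given labels (revocability)."""
        if not self.revocable:
            raise PermissionError("trust edge is not revocable")
        self.scope -= labels


def over_exposure_rate(disclosed: Set[str], minimum_necessary: Set[str]) -> float:
    """Illustrative Over-Exposure Rate: the share of disclosed items
    lying outside the Minimum Necessary Information set."""
    if not disclosed:
        return 0.0
    return len(disclosed - minimum_necessary) / len(disclosed)
```

For example, if an agent discloses `{"task_spec", "user_email"}` when only `{"task_spec"}` was necessary, this sketch's Over-Exposure Rate is 0.5; revoking `"user_email"` from the edge's scope then narrows future authorization. The point of the sketch is that each dimension becomes a measurable, schedulable quantity rather than an implicit social assumption.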

References

The open problem. In LLM-based multi-agent systems, "trust" is not merely a social assumption but an operational control variable: its strength (how much one agent accepts another’s claims), scope (what information/actions are authorized), and revocability (how fast permissions can be withdrawn) jointly shape exposure and error propagation.

The Trust Paradox in LLM-Based Multi-Agent Systems: When Collaboration Becomes a Security Vulnerability (2510.18563 - Xu et al., 21 Oct 2025) in Introduction (Section 1), paragraph titled "The open problem."