Monitoring LLM-based Multi-Agent Systems Against Corruptions via Node Evaluation

Published 22 Oct 2025 in cs.CR, cs.AI, cs.LG, cs.MA, and math.OC | (2510.19420v1)

Abstract: LLM-based Multi-Agent Systems (MAS) have become a popular paradigm of AI applications. However, trustworthiness issues in MAS remain a critical concern. Unlike challenges in single-agent systems, MAS involve more complex communication processes, making them susceptible to corruption attacks. To mitigate this issue, several defense mechanisms have been developed based on the graph representation of MAS, where agents represent nodes and communications form edges. Nevertheless, these methods predominantly focus on static graph defense, attempting to either detect attacks in a fixed graph structure or optimize a static topology with certain defensive capabilities. To address this limitation, we propose a dynamic defense paradigm for MAS graph structures, which continuously monitors communication within the MAS graph, then dynamically adjusts the graph topology, accurately disrupts malicious communications, and effectively defends against evolving and diverse dynamic attacks. Experimental results in increasingly complex and dynamic MAS environments demonstrate that our method significantly outperforms existing MAS defense mechanisms, contributing an effective guardrail for their trustworthy applications. Our code is available at https://github.com/ChengcanWu/Monitoring-LLM-Based-Multi-Agent-Systems.

Abstract PDF Upgrade to Chat

Summary

The paper proposes a dynamic defense paradigm that reconstructs MAS as directed acyclic graphs and uses backward propagation to detect malicious communication edges.
It employs a novel contribution extraction method using signed network evaluation to reliably assign scores to each communication edge.
Experimental results demonstrated a 93% detection success rate, outperforming defenses such as G-Safeguard and AgentXposed.

Monitoring LLM-based Multi-Agent Systems Against Corruptions via Node Evaluation

Introduction

LLMs have increasingly been integrated into complex Multi-Agent Systems (MAS), serving as the core to facilitate communication among agents. This integration, although enhancing capabilities, raises trustworthiness concerns due to potential corruption attacks. Unlike single-agent systems, MAS are vulnerable to complex communication processes that can be exploited through dynamic and evolving attacks. This paper proposes a dynamic defense paradigm that continuously monitors MAS communication graphs, disrupts malicious communications, and dynamically adjusts the graph topology to defend against these attacks effectively.

Figure 1: An overview of our method. In step 1, we reconstruct the MAS as a directed acyclic graph (DAG). In steps 2 and 3, we extract the contribution of each agent to the final decision using the contribution score on each edge and backward propagation from the final decision. This helps determine the latent malicious agents. We then repair the MAS by removing information sent from the detected malicious agents in step 4. The dashed line indicates that the communication edge has been deleted.

Methodology

The proposed method treats the MAS as a directed acyclic graph (DAG), where nodes represent agents and directed edges represent communications between them. This model allows for a comprehensive analysis of communication dynamics and enables the detection of malicious agents through a novel backpropagation technique.

MAS Graph Model: The MAS is modeled as a DAG to cater to computational convenience, with nodes representing agents at different time steps and edges representing directed communications between them.

Contribution Extraction: A signed network is utilized to evaluate contributions on each communication edge. The sign of an edge indicates the nature of its contribution—positive, negative, or neutral. This evaluation leverages an independent LLM to maintain consistency and reliability.

Backward Propagation for Detection: The contribution of each agent is computed using backward propagation across the signed network. This process identifies extreme deviation in contribution scores, indicative of malicious intent, allowing for the dynamic adjustment of the graph by removing harmful communications.

Experimental Results

Extensive experiments across various MAS configurations and datasets demonstrated the superiority of the proposed method over existing defense mechanisms. In experiments on complex tasks using the MMLU dataset, the method achieved an average detection success rate of 93%, significantly outperforming baselines such as G-Safeguard and AgentXposed. Under different attack strategies, the method maintained a robust defense performance, with accuracy dropping minimally compared to unprotected systems.

Implications and Future Work

This research contributes a dynamic and effective defense mechanism for MAS, addressing evolving security threats through a fine-grained node evaluation technique. The implications extend to enhancing the resilience of collaborative AI systems against corruption. Future research could explore adaptive threshold mechanisms and real-time application in expansive MAS environments, further solidifying the method's practical viability. The insights gained could inform the development of more advanced protection strategies that incorporate real-time detection and dynamic graph adaptations for enhanced security in AI systems.

Conclusion

The paper presents a robust method for safeguarding LLM-based MAS by dynamically evaluating node contributions to detect and neutralize malicious agents. The methodology redefines MAS graph defenses, offering an adaptive and proactive solution against increasingly sophisticated adversarial strategies. Through comprehensive evaluations, the proposed approach not only outperformed current defenses but also demonstrated significant potential for broad applications in real-world scenarios. The integration of this method paves the way for secure and trustworthy applications of MAS, crucial for the advancement of collaborative AI technologies.

Markdown

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Paper Prompts

Top Community Prompts

Explain it Like I'm 14

off on

Knowledge Gaps

off on

Practical Applications

off on

Glossary

off on

Conceptual Simplification

off on

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Generate Now

Continue Learning

Authors (5)

Collections

GitHub

GitHub - ChengcanWu/Monitoring-LLM-Based-Multi-Agent-Systems

Monitoring LLM-based Multi-Agent Systems Against Corruptions via Node Evaluation

Summary

Monitoring LLM-based Multi-Agent Systems Against Corruptions via Node Evaluation

Introduction

Methodology

Experimental Results

Implications and Future Work

Conclusion

Paper to Video (Beta)

Whiteboard

Paper Prompts

Top Community Prompts

Open Problems

Continue Learning

Related Papers

Authors (5)

Collections

GitHub

Tweets