Learning Multi-Agent Communication Protocol: Study on Information Entropy Efficiency in MARL

Published 5 Jun 2026 in cs.MA | (2606.07200v1)

Abstract: Multi-Agent Systems (MAS) have emerged as a fundamental paradigm for distributed problem-solving, where autonomous agents collaborate to achieve complex objectives. Within this framework, Multi-Agent Reinforcement Learning (MARL) with communication has demonstrated remarkable success in cooperative tasks. However, existing approaches predominantly pursue performance gains through increasingly complex architectures and expanding communication overhead, lacking principled metrics to evaluate the efficiency of information exchange. In this paper, we focus on enabling agents to learn efficient multi-agent communication protocols that balance performance and information compactness. We propose the Information Entropy Efficiency Index (IEI), a novel metric that quantifies the ratio between message entropy and task performance in learned communication protocols. A lower IEI indicates more compact and efficient message representations. By incorporating IEI into training loss functions, we encourage agents to develop communication protocols that achieve high performance with improved communication efficiency. Extensive experiments across diverse MARL algorithms demonstrate that our approach achieves equivalent or superior task performance compared to baseline methods while improving communication efficiency. These findings challenge the prevailing assumption that performance improvements require complex architectures or increased communication overhead and highlight the potential of improving both task success and communication efficiency to enable scalable MAS.


Authors (3)

Summary

The paper introduces the IEI metric to measure and optimize task-relevant message compactness in multi-agent reinforcement learning.
It develops a multi-round CTDE framework that dynamically encodes, aggregates, and refines messages to simultaneously boost coordination and efficiency.
Experimental results show that IEI-driven training reduces communication overhead and improves performance compared to traditional multi-round protocols.

Information Entropy Efficiency in Multi-Agent Communication Protocols for MARL

Introduction

Efficient inter-agent communication is a foundational challenge in Multi-Agent Reinforcement Learning (MARL). Existing research disproportionately focuses on task performance improvements, typically increasing network complexity and communication overhead without considering efficiency or practical deployment constraints. This paper systematically addresses this deficiency by introducing the Information Entropy Efficiency Index (IEI), a metric quantifying task-relevant message compactness, and incorporates this index into MARL optimization objectives for simultaneous gains in both coordination quality and communication efficiency (2606.07200).

Generalized MARL Communication Framework

The proposed framework formalizes multi-round, learned communication paradigms within centralized training and decentralized execution (CTDE). Each agent processes local observations into encoded hidden states, participates in up to $L$ structured communication rounds governed by a dynamic topology mechanism, and applies aggregation and update functions to iteratively refine internal states prior to policy execution.

Figure 1: Illustration of the message encoding, topology selection, aggregation, and update pipeline in multi-round MARL communication frameworks.
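The encode, topology-selection, aggregation, and update steps described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's implementation: the single `tanh` encoder, the ring-shaped `select_topology`, the mean-pooling `aggregate`, and the residual `update` are all hypothetical stand-ins for the learned components.

```python
import math
import random

def encode(obs, weights):
    """Map a local observation to a hidden state (one tanh layer, no bias)."""
    return [math.tanh(sum(w * x for w, x in zip(row, obs))) for row in weights]

def select_topology(n_agents, round_idx):
    """Hypothetical dynamic topology: a directed ring whose hop grows each round.

    Returns a mapping receiver -> list of senders.
    """
    hop = round_idx + 1
    return {i: [(i - hop) % n_agents] for i in range(n_agents)}

def aggregate(messages):
    """Mean-pool the incoming message vectors (an assumed aggregation rule)."""
    dim = len(messages[0])
    return [sum(m[d] for m in messages) / len(messages) for d in range(dim)]

def update(hidden, incoming):
    """Refine the hidden state with the aggregated message (assumed residual form)."""
    return [math.tanh(h + c) for h, c in zip(hidden, incoming)]

def communicate(observations, weights, num_rounds):
    """Run up to `num_rounds` (the L of the text) communication rounds."""
    hidden = [encode(o, weights) for o in observations]
    for r in range(num_rounds):
        topology = select_topology(len(hidden), r)
        # Each message is simply the sender's current hidden state.
        incoming = [aggregate([hidden[j] for j in topology[i]])
                    for i in range(len(hidden))]
        hidden = [update(h, c) for h, c in zip(hidden, incoming)]
    return hidden

rng = random.Random(0)
obs_dim, hid_dim, n_agents = 3, 4, 5
weights = [[rng.uniform(-1, 1) for _ in range(obs_dim)] for _ in range(hid_dim)]
observations = [[rng.uniform(-1, 1) for _ in range(obs_dim)] for _ in range(n_agents)]
final_states = communicate(observations, weights, num_rounds=2)
```

In a CTDE setting the refined states in `final_states` would feed each agent's policy head; that part is omitted here.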

Evaluation of five baselines (MAGIC, CommNet, TarMAC, GA-Comm, and IC3Net) on the Traffic Junction benchmark indicates that additional communication rounds improve coordination and final success rates but incur significant bandwidth and latency penalties, highlighting the need for principled efficiency metrics and policies.

Figure 2: Success rate advantage conferred by increasing communication rounds ($L=1$ versus $L=2$) in baseline MARL algorithms.

Information Entropy Efficiency Index (IEI)

The IEI is defined as the ratio of average message entropy to task success rate: $\Phi_{\text{IEI}_t} = H_t / \mathscr{S}_t$, where $H_t$ is the mean agent message entropy in epoch $t$, aggregated over rounds and agents, and $\mathscr{S}_t$ is the task success rate. A lower value indicates more compact messages for a given level of success. This formulation operationalizes communication efficiency as a direct learning target, reversing the conventional bias toward quantity over compactness.
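As a rough illustration of the ratio $H_t / \mathscr{S}_t$, the sketch below computes an IEI value from a batch of messages. The entropy estimator is an assumption: each message vector is discretized into uniform bins over $[-1, 1]$ and its Shannon entropy is taken in bits. The source does not specify how entropy is estimated, and the names `message_entropy`, `iei`, `num_bins`, and `eps` are hypothetical.

```python
import math
from collections import Counter

def message_entropy(message, num_bins=8, low=-1.0, high=1.0):
    """Shannon entropy (bits) of one message vector after uniform binning."""
    width = (high - low) / num_bins
    # Clamp bin indices so values at or beyond the range edges stay valid.
    bins = [min(num_bins - 1, max(0, int((x - low) / width))) for x in message]
    counts = Counter(bins)
    total = len(bins)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def iei(messages_per_round, success_rate, eps=1e-8):
    """Mean entropy over all rounds and agents, divided by the success rate."""
    entropies = [message_entropy(m)
                 for round_msgs in messages_per_round
                 for m in round_msgs]
    mean_entropy = sum(entropies) / len(entropies)
    # eps guards against division by zero early in training (an assumption).
    return mean_entropy / max(success_rate, eps)

# One round, one agent: a constant message versus a widely spread one.
compact = [[[0.5, 0.5, 0.5, 0.5]]]
spread = [[[-0.9, -0.4, 0.1, 0.6]]]
iei_compact = iei(compact, success_rate=0.5)
iei_spread = iei(spread, success_rate=0.5)
```

At equal success rates the constant message yields the lower index, matching the reading that a lower IEI means a more compact protocol.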

In experiments, convergence dynamics depend on the algorithm: some methods (e.g., TarMAC, MAGIC) compact message entropy rapidly in early training, while others (e.g., IC3Net) converge more slowly but ultimately reach lower final entropy.

Figure 3: Comparative trends in $\Phi_{\text{IEI}}$ across algorithms reveal heterogeneous efficiency-improvement profiles and learning dynamics.

Figure 4: Training progression visualizes message distribution evolution: high-variance, high-entropy encodings progressively coalesce into compact, regular structures under the proposed framework.

Joint Optimization of Performance and Efficiency

IEI is incorporated into a composite objective via a regularization-enhanced loss:

$\mathcal{L}_t = l_{\mathbf{a}^t} + w_ql_{Q_t} + w_{\text{IEI}_t}\Phi_{\text{IEI}_t}$

A dynamic adjustment mechanism scales the regularization weight in response to real-time success and entropy, prioritizing task completion during early or unstable training and shifting emphasis toward efficiency as performance stabilizes. Sensitivity studies over the regularization parameters $(\alpha, \beta)$ indicate that the method is robust within a range but requires calibrated parameter selection.
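The composite objective $\mathcal{L}_t = l_{\mathbf{a}^t} + w_q l_{Q_t} + w_{\text{IEI}_t}\Phi_{\text{IEI}_t}$ can be illustrated with scalar arithmetic. The weight schedule below, $w_{\text{IEI}_t} = \alpha\,\mathscr{S}_t^{\beta}$, is a hypothetical choice that only reproduces the qualitative behavior described above (small weight at low success, larger weight as success rises). The actual adjustment rule, which also responds to entropy, is not given in this summary. Reading $l_{\mathbf{a}^t}$ as a policy loss and $l_{Q_t}$ as a critic loss is likewise an assumption.

```python
def iei_weight(success_rate, alpha=0.1, beta=2.0):
    """Hypothetical schedule: weight rises from 0 toward alpha as success nears 1."""
    return alpha * (success_rate ** beta)

def composite_loss(policy_loss, critic_loss, entropy, success_rate,
                   w_q=0.5, alpha=0.1, beta=2.0, eps=1e-8):
    """policy_loss + w_q * critic_loss + w_IEI * (entropy / success_rate)."""
    phi_iei = entropy / max(success_rate, eps)        # the IEI term
    w_iei = iei_weight(success_rate, alpha, beta)     # assumed dynamic weight
    return policy_loss + w_q * critic_loss + w_iei * phi_iei

# Same task losses and entropy; only the success rate differs.
early = composite_loss(1.0, 2.0, entropy=2.0, success_rate=0.1)
late = composite_loss(1.0, 2.0, entropy=2.0, success_rate=0.9)
```

With this schedule the efficiency penalty contributes little while success is low and more once the task is largely solved, which is one way to realize the shift in emphasis the text describes.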

Figure 5: Success rate and $\Phi_{\text{IEI}}$ trajectories for loss-augmented versus conventional learning: joint optimization can accelerate convergence, boost end performance, and achieve lower communication entropy.

Figure 6: Sensitivity analysis of $\alpha$ (loss weight) and $\beta$ (success-scaling) demonstrates optimal regions for improved trade-offs but exposes instability under mis-calibration.

Communication Cost, Efficiency, and Pareto Analysis

Further results on total communication burden per epoch demonstrate that the proposed IEI-driven learning maintains low message overheads comparable to single-round baselines, while matching or outperforming multi-round approaches in maximum performance metrics.

Figure 7: Message count per epoch: communication overhead for IEI-augmented single-round approaches is stable and nearly as low as the minimal baseline, unlike multi-round protocols.

Measured as success per million messages, communication efficiency is highest for IEI-augmented policies in every evaluated setting, without resorting to higher-round strategies.

Figure 8: Communication efficiency (performance per message) is consistently highest for single-round IEI-augmented protocols.
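The efficiency measure named above, success per million messages, is a simple normalization. The helper below shows one plausible reading of it; the function name and the input values are placeholders chosen for illustration and are not results from the paper.

```python
def success_per_million_messages(success_rate, messages_sent):
    """Success rate divided by the message count expressed in millions."""
    if messages_sent <= 0:
        raise ValueError("messages_sent must be positive")
    return success_rate / (messages_sent / 1_000_000)

# Placeholder inputs: a slightly less successful run that sends far fewer
# messages can still score higher on this measure.
heavy = success_per_million_messages(0.9, 2_000_000)
light = success_per_million_messages(0.8, 500_000)
```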

Pareto analysis shows that the IEI-enhanced methods ($L=1$, with IEI) form the optimal frontier, jointly minimizing communication cost and maximizing success, across all evaluated MARL domains and architectures.

Figure 9: Pareto frontier in the communication cost vs. performance plane; IEI-enhanced settings define the optimal bound, obviating the need for multi-round schemes.
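For readers unfamiliar with the construction, a Pareto frontier in the cost versus success plane keeps every configuration that no other configuration beats on both axes. The sketch below is a generic dominance check; the configuration names and numbers are made-up placeholders, not values from the paper.

```python
def pareto_frontier(points):
    """Return names of non-dominated points.

    `points` is a list of (name, cost, success). A point is dominated when
    another point has cost no higher and success no lower, with at least one
    of the two strictly better.
    """
    frontier = []
    for name, cost, success in points:
        dominated = any(
            (c <= cost and s >= success) and (c < cost or s > success)
            for _, c, s in points
        )
        if not dominated:
            frontier.append(name)
    return frontier

# Placeholder runs: (name, communication cost, success rate).
runs = [
    ("config_a", 1.0, 0.90),
    ("config_b", 2.0, 0.92),
    ("config_c", 2.0, 0.85),
    ("config_d", 1.2, 0.80),
]
frontier = pareto_frontier(runs)
```

Here `config_a` and `config_b` survive: the first is cheapest, the second is most successful, and the other two are beaten by `config_a` on both axes.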

Theoretical and Practical Implications

This work demonstrates that multi-agent coordination improvements are not strictly a function of expanded communication bandwidth or architectural complexity. By casting communication compactness as an explicit optimization target and employing dynamic regularization, agents autonomously discover efficient, low-entropy protocols that support scalable MAS deployment under practical resource constraints.

The IEI metric enables systematic, reproducible evaluation and comparison of MARL communication effectiveness, supporting future studies on both algorithmic development and deployment-case analyses (e.g., bandwidth-constrained robotic collectives, sensor networks).

Extension to other tasks, communication structures, or more autonomous topology learning appears straightforward and warrants further research.

Conclusion

This paper formalizes, implements, and empirically validates the Information Entropy Efficiency Index (IEI) as a central tool for developing and evaluating communication-efficient MARL protocols. The IEI-driven optimization framework delivers near-optimal trade-offs between coordination quality and communication cost, strongly challenging the prevailing assumption that increased performance requires either deeper networks or greater communication bandwidth. These innovations advance scalable, deployable MAS and clarify open questions regarding the fundamental nature of learned communication under real-world constraints (2606.07200).
