KVComm: Enabling Efficient LLM Communication through Selective KV Sharing

Published 2 Oct 2025 in cs.LG, cs.AI, and cs.MA | (2510.03346v1)

Abstract: LLMs are increasingly deployed in multi-agent systems, where effective inter-model communication is crucial. Existing communication protocols either rely on natural language, incurring high inference costs and information loss, or on hidden states, which suffer from information concentration bias and inefficiency. To address these limitations, we propose KVComm, a novel communication framework that enables efficient communication between LLMs through selective sharing of KV pairs. KVComm leverages the rich information encoded in the KV pairs while avoiding the pitfalls of hidden states. We introduce a KV layer-wise selection strategy based on attention importance scores with a Gaussian prior to identify the most informative KV pairs for communication. Extensive experiments across diverse tasks and model pairs demonstrate that KVComm achieves comparable performance to the upper-bound method, which directly merges inputs to one model without any communication, while transmitting as few as 30% of layers' KV pairs. Our study highlights the potential of KV pairs as an effective medium for inter-LLM communication, paving the way for scalable and efficient multi-agent systems.

Summary

The paper introduces KVComm, a framework that leverages attention scores and Gaussian priors to selectively share key-value pairs for efficient LLM communication.
It aims to reduce communication overhead and to avoid the information concentration bias of hidden states by favoring KV pairs from intermediate layers.
Experimental results show that KVComm outperforms baseline methods, achieving competitive performance with significantly lower data transmission requirements.

Introduction

The paper "KVComm: Enabling Efficient LLM Communication through Selective KV Sharing" (2510.03346) addresses inefficiencies in inter-LLM communication within multi-agent systems. Existing methods either rely on natural language, which incurs high inference costs and information loss, or on hidden states, which suffer from information concentration bias and inefficient data sharing. KVComm instead shares a selected subset of KV pairs, aiming to convey rich semantic information efficiently without exchanging hidden states directly.

Motivation

The motivation behind KVComm lies in overcoming the shortcomings of existing communication protocols. The inefficiency of natural language communication and the information concentration bias of hidden states necessitate a more robust framework. KV pairs provide a representative form of activation information across layers, and the receiving model can use the encoded information directly through its attention mechanism.

KVComm Framework

KVComm adopts a selection strategy based on attention importance scores and Gaussian priors to identify the most informative KV pairs for communication.

Figure 1: KVComm framework for efficient LLM communication through selective KV sharing.

Problem Formulation

In scenarios where LLMs collaboratively solve tasks, a sender model $\mathcal{M}_s$ processes context $C$, generating information $I_C$ to be transmitted. A receiver model $\mathcal{M}_r$ uses $I_C$, combined with a query $Q$, to produce the final output. The communication protocol should transmit as little data as possible without sacrificing task performance.
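The formulation above can be sketched in code. The following is a minimal toy illustration, not the authors' implementation: it assumes that $I_C$ is a mapping from selected layer indices to that layer's (key, value) tensors, and that the receiver uses shared KV pairs by prepending them to its own along the token axis. The function names `sender_kv`, `build_message`, and `receiver_attention` are hypothetical, random arrays stand in for real model activations, and causal masking is omitted for brevity.

```python
import numpy as np

def sender_kv(num_layers, context_len, heads, head_dim, seed=0):
    """Toy stand-in for M_s: one (K, V) pair per layer for the context C."""
    rng = np.random.default_rng(seed)
    shape = (heads, context_len, head_dim)
    return [(rng.standard_normal(shape), rng.standard_normal(shape))
            for _ in range(num_layers)]

def build_message(kv_per_layer, selected_layers):
    """I_C (assumed form): only the selected layers' KV pairs are transmitted."""
    return {layer: kv_per_layer[layer] for layer in selected_layers}

def receiver_attention(query, own_k, own_v, shared=None):
    """One attention layer in M_r. Shapes are (heads, tokens, head_dim).

    If the sender shared KV pairs for this layer, they are prepended along
    the token axis so the query tokens can attend to the context as well.
    """
    if shared is not None:
        own_k = np.concatenate([shared[0], own_k], axis=1)
        own_v = np.concatenate([shared[1], own_v], axis=1)
    scores = query @ own_k.transpose(0, 2, 1) / np.sqrt(query.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ own_v

# Usage: the sender shares layers 1 and 2 out of 4.
kv = sender_kv(num_layers=4, context_len=5, heads=2, head_dim=8)
message = build_message(kv, selected_layers=[1, 2])
rng = np.random.default_rng(1)
q, k, v = (rng.standard_normal((2, 3, 8)) for _ in range(3))
out = receiver_attention(q, k, v, shared=message.get(1))
print(out.shape)
```

Layers that are absent from the message simply run attention over the receiver's own KV pairs, which is why `message.get(layer)` returning `None` is treated as "nothing shared".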

Limitations of Previous Approaches

Hidden states have been proposed before as a communication medium, but they suffer from information concentration bias: the most critical information sits predominantly in the last token's hidden state, especially in later layers. Transmitting only part of the hidden states therefore loses significant information. Methods that instead rely on all tokens' hidden states face a trade-off between computational inefficiency and performance degradation.

Figure 2: Compared to other token positions, the last token's hidden state is the most critical, especially in later layers.

KVComm circumvents these challenges by sharing only a selected subset of KV pairs. The subset is chosen using attention scores together with a preference for intermediate layers, which are hypothesized to be semantically rich. This selection keeps communication efficient while retaining essential context information.

KV Selection Strategies

The selection rests on two hypotheses: that knowledge is embedded mainly in intermediate layers, and that the attention distribution serves as a proxy for how valuable a layer's KV pairs are for communication.

Figure 3: Effective communication with limited hyperparameters.

An attention importance score is computed for each layer by averaging the attention weight assigned to context tokens. The layer-wise selection then combines this score with a Gaussian prior over layer depth, defined by a center and a spread, which favors intermediate layers.
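One way to picture this layer-wise selection is the NumPy sketch below. It is written under stated assumptions rather than taken from the paper: the attention score is min-max normalized, it is mixed with the Gaussian prior through a weighted sum with a hypothetical weight `alpha`, and the top-scoring fraction `ratio` of layers is kept. The names `attention_importance`, `gaussian_prior`, and `select_layers` are illustrative.

```python
import numpy as np

def attention_importance(attn, context_len):
    """Per-layer mean attention mass placed on context tokens.

    attn has shape (layers, heads, query_len, key_len); the first
    context_len key positions are assumed to hold the context.
    """
    mass = attn[..., :context_len].sum(axis=-1)  # (layers, heads, query_len)
    return mass.mean(axis=(1, 2))                # (layers,)

def gaussian_prior(num_layers, center, spread):
    """Unnormalized Gaussian over layer indices, peaking at `center`."""
    layers = np.arange(num_layers)
    return np.exp(-((layers - center) ** 2) / (2.0 * spread ** 2))

def select_layers(attn, context_len, center, spread, ratio=0.3, alpha=0.5):
    """Return sorted indices of the layers whose KV pairs would be shared."""
    scores = attention_importance(attn, context_len)
    span = scores.max() - scores.min()
    # Min-max normalize so the two terms are on a comparable scale (assumption).
    norm = (scores - scores.min()) / span if span > 0 else np.zeros_like(scores)
    combined = alpha * norm + (1.0 - alpha) * gaussian_prior(len(scores), center, spread)
    k = max(1, int(round(ratio * len(scores))))
    top = np.argsort(combined)[::-1][:k]
    return sorted(top.tolist())

# Usage with random attention weights for a 10-layer toy model.
rng = np.random.default_rng(0)
raw = rng.random((10, 2, 3, 8))
attn = raw / raw.sum(axis=-1, keepdims=True)  # rows sum to 1 like softmax output
print(select_layers(attn, context_len=5, center=4, spread=2.0))
```

With `alpha=0` the choice depends only on the prior and reduces to picking the layers nearest `center`; with `alpha=1` it depends only on the measured attention.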

Figure 4: Better communication performance with higher attention level.

Experimental Results

KVComm is reported to reduce computational and communication cost relative to the compared methods while maintaining or improving task performance. The comparison includes methods such as Skyline and NLD, and per the abstract KVComm remains comparable to the upper-bound method while transmitting as few as 30% of layers' KV pairs.

Figure 5: Llama-3.2-3B on MMLU Social Science

The experiments indicate that KVComm achieves performance comparable to directly merging the inputs into one model, with substantially reduced communication overhead. These results support KVComm's practical utility in scalable multi-agent systems.
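For a rough sense of scale, the saving from sending only a fraction of layers can be estimated with simple arithmetic. The sketch below assumes that every layer's KV pair has the same size and that values are stored in 16-bit precision. The model dimensions are illustrative placeholders, not figures reported in the paper, and `kv_bytes` and `selected_layer_count` are hypothetical helper names.

```python
def kv_bytes(num_layers, kv_heads, head_dim, num_tokens, bytes_per_value=2):
    """Size of the KV pairs for `num_layers` layers, in bytes."""
    # The factor 2 accounts for one key tensor and one value tensor per layer.
    return 2 * num_layers * kv_heads * head_dim * num_tokens * bytes_per_value

def selected_layer_count(total_layers, ratio):
    """Number of layers shared when keeping roughly `ratio` of them."""
    return max(1, round(total_layers * ratio))

# Illustrative placeholder configuration (assumption, not from the paper).
total_layers, kv_heads, head_dim, num_tokens = 28, 8, 128, 512

full = kv_bytes(total_layers, kv_heads, head_dim, num_tokens)
shared_layers = selected_layer_count(total_layers, 0.3)
partial = kv_bytes(shared_layers, kv_heads, head_dim, num_tokens)

print(f"all {total_layers} layers: {full / 2**20:.1f} MiB")
print(f"{shared_layers} selected layers: {partial / 2**20:.1f} MiB")
```

Because the cost is linear in the number of layers, the transmitted volume shrinks in direct proportion to the share of layers selected.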

Conclusion

KVComm proposes selective KV sharing as a medium for efficient LLM communication, addressing the information concentration bias of hidden states and the inefficiency of natural language. By focusing on intermediate layers and using attention scores, KVComm establishes a foundation for future work on efficient inter-LLM communication that balances computational cost against the richness of the shared information.

The broader implications of KVComm lie in its potential to enhance collaborative problem-solving capabilities in AI-driven multi-agent environments, paving the way for more scalable and efficient systems.

This approach opens avenues for further research in optimizing communication protocols by blending KVComm's principles with existing methods to address varying complexities in real-world applications.


Authors (5)
