Multi-agent Memory Systems

Updated 17 September 2025

Multi-agent memory systems are architectures that enable collections of intelligent agents to collaboratively store, update, and retrieve information over dynamic interactions.
Key designs include layered memory models, graph-based search, and consensus protocols that ensure secure data sharing and coordinated learning.
Recent approaches employ admission scoring, collaborative pruning, and cross-agent synchronization to optimize memory utility while minimizing redundancy.

Multi-agent Memory Systems comprise the set of architectural, algorithmic, and formal strategies by which collections of intelligent agents store, retrieve, revise, and propagate information over time and interaction. Unlike single-agent scenarios—where memory typically serves only an individual—multi-agent systems (MAS) require memory solutions that support decentralized coordination, robust consensus, scalable learning, layered privacy, verifiability, secure sharing, and dynamic evolution across heterogeneous tasks and domains. Recent research addresses these requirements through innovations in memory abstraction, storage topology, admission control, retrieval optimization, security, and cross-agent synchronization.

1. Foundations and Key Motivations

At the core of multi-agent memory systems is the need for agents to collectively process and retain knowledge acquired through local observations, communications, and cumulative experiences over extended interactions. Memory in this context is not only a repository of past states and outcomes but a structural enabler for capabilities such as:

Consensus achievement in distributed settings under resource limitations (Natale et al., 2019).
Coordination and efficient policy learning in partially observable, decentralized reinforcement learning (Zhou et al., 2019, Wang et al., 2022, Sagirova et al., 22 Jan 2025).
Secure, privacy-preserving, and auditable sharing of sensitive or privileged information (Mao et al., 6 Mar 2025, Rezazadeh et al., 23 May 2025).
Lifelong and open-ended learning, continual reasoning, and adaptability across domains (Han et al., 5 Feb 2024, Xu et al., 11 Sep 2025).

Memory architectures are therefore designed to balance several pressures: computational and storage bounds per agent, the complexity of context integration, the need for robust collaboration without centralized control, and guarantees of correctness, privacy, and resilience.

2. Models of Memory Structure and Access

Multi-agent systems employ various memory architectures, often layered or modular, to separate local and shared knowledge. Common structural patterns include:

Layered Memory Model (Han et al., 5 Feb 2024, Wang et al., 10 Jul 2025):

Short-term (Sᵢ): Volatile, task- or session-scoped context.
Long-term (Lᵢ): Persistent storage for historical interactions, typically offloaded to external vector DBs or structured archives.
Episodic/Contextual: Sequences of interactions, capturing temporally or contextually cohesive episodes for later retrieval.
Consensus or Shared Memory (C): Repository accessible across agents for skill transfer, cross-agent alignment, and task-wide state propagation.

A general formalism:

$R_i = f(S_i, L_i, E_i, C)$

where $R_i$ is the aggregated memory view for agent $i$ , $E_i$ represents contributions from external sources (e.g., RAG), and $C$ is the consensus memory (Han et al., 5 Feb 2024).

Hierarchical and Modular Memory:

Approaches such as MIRIX (Wang et al., 10 Jul 2025) implement specialized managers for distinct memory types—core, episodic, semantic, procedural, resource, and knowledge vault. Each component is accessed and updated via a routing and retrieval mechanism coordinated by a meta memory manager.

Graph-Structured and Cross-Tier Architectures:

G-Memory (Zhang et al., 9 Jun 2025) employs a three-tier graph comprising interaction graphs (utterances), query graphs (task instances), and insight graphs (generalized knowledge). Bi-directional traversals provide both abstraction and specificity, with new experiences assimilated at all levels to drive progressive team evolution.

Memory Sharing and Pooling:

Frameworks like Collaborative Memory (Rezazadeh et al., 23 May 2025) and domain-specific shared memory in SRMT (Sagirova et al., 22 Jan 2025) or FCMNet (Wang et al., 2022) pool fragments or vectorized states across agents, subject to access control, so that relevant context and coordination cues are globally available, often via cross-attention or multi-hop communication.

3. Memory Admission, Update, and Pruning Strategies

Effective memory systems address uncontrolled growth, noise accumulation, and relevance drift via explicit mechanisms for write admission, update scheduling, and collaborative pruning.

Verifiable Write Admission (Xu et al., 11 Sep 2025): Memory acceptance is predicated on reproducible, environment-free replay in self-contained execution contexts (SCEC), supporting A/B testing:

For candidate $m$ , compute composite admission score:

$S = \Delta R - \lambda_L \Delta L - \lambda_T \Delta T$

where $\Delta R$ is improvement in reward, $\Delta L$ in latency, $\Delta T$ in token usage. Admission is allowed if $S \geq \eta$ . This ensures only empirically beneficial experiences propagate to memory.

Self-Scheduling and Utility-Based Memory Curation (Xu et al., 11 Sep 2025): Memory controllers monitor access frequency and utility, adjusting weights via

$w_{t+1}(m) = w_t(m) + \alpha \bar{U}_t(m) - \beta f_{use,t}(m)$

Only high-utility entries survive; near-duplicate memories are consolidated to control growth.

Synchronized Pruning Protocols (Bach, 19 Jun 2025): The Co-Forgetting protocol blends semantic voting (DistilBERT-based), multi-scale temporal decay

$D_i(t) = \exp\left(-\frac{t - t_{\text{last}}(m)}{S_i}\right)$

and PBFT-backed consensus to safely discard obsolete or low-utility memory items. This achieves consistent memory reduction and robust fault tolerance across agents.

4. Access Control, Security, and Auditability

With the growing use of MAS in sensitive and dynamic applications, robust memory access and modification policies are essential.

Hierarchical Data Management (Mao et al., 6 Mar 2025): AgentSafe classifies memory fragments by explicit security levels ( $\ell(s_{i,k})$ ). Reads and writes are validated via multi-step workflows (ThreatSieve and HierarCache) that jointly enforce role- and content-based access:

$U(v_i, v_j, m, \ell)$ supports both permission checks and semantic validation.
Unauthorized or malicious writes are diverted to isolated "junk" memory; valid entries propagate only if both sender and receiver permissions align and semantic checks succeed.

Granular, Policy-Driven Sharing (Rezazadeh et al., 23 May 2025): Collaborative Memory maintains distinct private and shared memory tiers, with provenance attributes (originating user, agent set, resource access, timestamp) and bipartite graphs modeling dynamic, asymmetric permissions. Auditability and compliance are enforced via retrospective permission checks on every memory operation.

Consensus and Memory in Security-Critical Consensus Protocols (Natale et al., 2019, Bach, 19 Jun 2025): Robustness is further supported by lower bound memory requirements for distributed consensus, and mechanisms ensuring only qualified majorities can alter shared state, even when a fraction of agents are Byzantine.

5. Coordination and Retrieval in Large-Scale, Heterogeneous Environments

Scalable multi-agent memory requires steering between the competing imperatives of comprehensive context and resource efficiency.

Efficient Routing and Selection (Liu et al., 6 Aug 2025): RCR-Router provides role-aware, token-budgeted context selection, determining for each agent $A_i$ the subset $C^t_i$ maximizing

$\sum_{m \in C'} \alpha(m; R_i, S_t)$

subject to a strict token budget $B_i$ . This is achieved via importance-driven ranking and iterative context refinement.

Memory Retrieval Mechanisms:

Composite scoring (embedding, BM25, string match) for multimodal and multi-memory retrieval in MIRIX (Wang et al., 10 Jul 2025).
Cosine similarity matching for retrieval and context memory in MMS (Zhang et al., 21 Aug 2025):

$\text{cos\_sim}(q, v) = \frac{q \cdot v}{\|q\| \|v\|}$

Selected memory units are then paired with contextual augmentations for high-quality generation.

Graph-based search (bi-directional traversal, hop expansion, sparsification) in G-Memory (Zhang et al., 9 Jun 2025) for multilevel, role-targeted recall.

Distributed and Pooled Memory:

SRMT (Sagirova et al., 22 Jan 2025) demonstrates implicit broadcast of memory vectors via cross-attention, enabling agents to coordinate without explicit message passing—a principle also leveraged by FCMNet (Wang et al., 2022) in RL settings.

6. Reasoning, Lifelong Learning, and Adaptation

Cross-domain generalization, continual learning, and collective reasoning are enabled via:

Cross-domain diffusion, with memory abstraction (removal of specifics to retain reusable structures) and conservative weight transfer (Xu et al., 11 Sep 2025).
Hierarchical insight propagation, with upward (generalizing) and downward (specializing) synthesis (Zhang et al., 9 Jun 2025).
Episodic and semantic memory interplay for domain adaptation and iterative verification cycles (Flores et al., 14 Aug 2025).

Empirical findings show improvements in multi-hop reasoning, reduction in hallucinations, superior adaptation to new task regimes, and elevated team performance in QA and embodied tasks (Zhang et al., 9 Jun 2025, Xu et al., 11 Sep 2025, Wang et al., 10 Jul 2025, Zhang et al., 21 Aug 2025).

7. Challenges and Implications

Persistent issues include drift and relevance loss in large memory stores, emergent redundancy, audit costs, complex context layering, latency versus recall trade-offs, and maintaining security/privacy under adversarial pressure. Several frameworks recommend:

Fine-tuned, token-constrained context routing (Liu et al., 6 Aug 2025).
Semi-automatic consolidation and self-healing memory (Xu et al., 11 Sep 2025).
Use of domain-specific consensus, fraud detection, and anomaly identification mechanisms layered atop episodic or shared memory (Han et al., 5 Feb 2024).

Future developments are expected to focus on modular plug-and-play memory components, improved sparsification and abstraction, learned fine-grained access and sharing policies, and systematic integration of verification and provenance in multi-agent collaboration.

The field of multi-agent memory systems advances via a convergence of verifiable utility maximization, layered and structured context management, and robust, policy-driven sharing—enabling scalable, accountable, and adaptive memory across open-ended, decentralized intelligence.