
Adaptive Memory via Multi-Agent Collaboration

Updated 4 February 2026
  • Adaptive Memory via Multi-Agent Collaboration is a paradigm that employs coordinated agents to manage hierarchical memory modules and support robust, role-aware reasoning.
  • The framework integrates multi-tier memory architectures, adaptive retrieval strategies, and verifiable memory admission to optimize token usage and system performance.
  • AMA systems demonstrate practical improvements in efficiency and scalability through role-specific processing, decentralized coordination, and dynamic memory updates.

Adaptive Memory via Multi-Agent Collaboration (AMA) is a paradigm that leverages the coordination of multiple agents—often empowered by LLMs—to enable robust, flexible, and role-aware long-term memory systems that optimize collective reasoning, planning, and adaptation in complex environments. AMA research integrates hierarchical memory modeling, distributed retrieval, verifiable memory admission, and specialized agentic protocols, targeting performance and scalability constraints in LLM-powered multi-agent systems (MAS).

1. Hierarchical and Modular Architectures

AMA frameworks typically decompose memory processing and collaboration into structured modules or tiers, enabling specialization and efficient management of context across multiple scales.

  • Multi-Tier Memory Hierarchies: Cutting-edge frameworks such as AMA (Huang et al., 28 Jan 2026), G-Memory (Zhang et al., 9 Jun 2025), and MLC-Agent (Zhang et al., 27 Jul 2025) maintain complementary stores: raw episodic text, fine-grained facts, summarized episodes, knowledge graphs, and role-specific latent memories.
    • Example: AMA (Huang et al., 28 Jan 2026) organizes Raw Text Memory ($M_{\text{raw}}$), Fact Knowledge Memory ($M_{\text{fact}}$), and Episodic Memory ($M_{\text{epi}}$), each indexed by dense vector encodings; G-Memory constructs three interlinked graph layers (Interaction, Query, Insight), supporting bi-directional retrieval and cross-trial learning.
    • Compact Latent Memory: LatentMem (Fu et al., 3 Feb 2026) employs an “Experience Bank” and a learnable Memory Composer $\sigma_\phi$ that generates compact, role-aware latent tokens, mitigating information overload and homogenization.
  • Multi-Agent Roles: Memory management is distributed among specialized agents (Constructor, Retriever, Judge, Refresher, Manager, etc.) that coordinate extraction, allocation, consistency verification, and targeted updates (Huang et al., 28 Jan 2026, Zhang et al., 30 Jan 2026).
    • Manager-Member Hierarchies: MiTa (Zhang et al., 30 Jan 2026)—a hierarchical manager-member system—introduces central Allocation and Summary modules, while member agents focus on perception, local memory, and negotiation.
  • Cross-Modal, Semantic, and Procedural Integration: Adaptive knowledge graph systems (Yang et al., 8 Feb 2025) and both procedural/semantic memory store architectures (Kim et al., 2022) support multi-modal representations (text, symbolic, visual), procedural facts, and dynamic semantic abstractions.
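
As a concrete (and deliberately simplified) illustration of the multi-tier idea, the sketch below keeps separate raw/fact/episodic stores indexed by dense vectors. The class names, tiny embeddings, and cosine-similarity lookup are illustrative assumptions, not any cited paper's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryEntry:
    text: str
    embedding: list[float]  # dense vector index for the entry

@dataclass
class TieredMemory:
    # Three complementary stores, loosely mirroring M_raw / M_fact / M_epi.
    raw: list[MemoryEntry] = field(default_factory=list)
    fact: list[MemoryEntry] = field(default_factory=list)
    episodic: list[MemoryEntry] = field(default_factory=list)

    def retrieve(self, tier: str, query_emb: list[float], k: int = 3) -> list[MemoryEntry]:
        """Return the k entries of one tier most similar to the query vector."""
        def cos(a, b):
            num = sum(x * y for x, y in zip(a, b))
            den = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
            return num / den if den else 0.0
        store = getattr(self, tier)
        return sorted(store, key=lambda e: cos(e.embedding, query_emb), reverse=True)[:k]
```

In a full system each tier would sit behind an approximate-nearest-neighbor index; the point here is only that routing a query to the right tier is a lookup on a named store, not a scan over one undifferentiated context.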

2. Memory Construction, Retrieval, and Adaptive Routing

AMA systems implement adaptive, context-sensitive retrieval algorithms that route queries to memory subsystems matching the required granularity and specificity.

  • Granularity-Adaptive Retrieval: AMA's Retriever module determines, via intent classification and learned budgets, whether queries should access raw logs, factual entries, or high-level summaries, optimizing for precision and avoiding retrieval noise (Huang et al., 28 Jan 2026). The dynamic cutoff

    $K = \max(K_{\text{dyn}}, K_{\text{min}})$

    ensures that retrieval matches query complexity.

  • Hierarchical Traversal in Graph Memories: In G-Memory, bi-directional traversal supports coarse-to-fine memory surfacing: high-level insights guide cross-trial transfer, while subgraph retrieval isolates relevant inter-agent trajectories (Zhang et al., 9 Jun 2025).
  • Conflict and Consistency Verification: Specialized Judge agents iteratively verify relevance and detect logical conflicts, invoking Refresher agents for targeted updates or deletions (Huang et al., 28 Jan 2026). This iterative process maintains long-term consistency at low token cost (≈80% reduction in context size).
  • Adaptive Indexing and Search: Systems such as SAMEP (Masoor, 5 Jul 2025) and SEDM (Xu et al., 11 Sep 2025) leverage vector-based indices and hybrid scoring—balancing semantic similarity, recency, and access priorities—to select context for current tasks, and adapt embeddings post-reuse by incremental update.
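
A minimal sketch of granularity-adaptive routing with the dynamic cutoff $K = \max(K_{\text{dyn}}, K_{\text{min}})$: here $K_{\text{dyn}}$ counts candidates whose relevance clears a threshold, and the intent-to-tier mapping, threshold value, and floor $K_{\text{min}}$ are illustrative assumptions rather than learned budgets.

```python
def dynamic_cutoff(scores, threshold=0.5, k_min=2):
    """K = max(K_dyn, K_min): K_dyn is the number of candidates whose
    relevance score clears the threshold; k_min guarantees a floor."""
    k_dyn = sum(1 for s in scores if s >= threshold)
    return max(k_dyn, k_min)

def route_query(intent, candidate_scores, k_min=2):
    """Route a classified intent to a memory tier and decide how many
    entries to surface; queries with more strong candidates pull more."""
    tier = {"summary": "episodic", "fact": "fact", "detail": "raw"}.get(intent, "fact")
    return tier, dynamic_cutoff(candidate_scores, k_min=k_min)
```

The design point: the cutoff adapts to how many candidates are actually relevant, so a narrow factual query does not drag in a fixed top-k of marginal matches.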

3. Memory Maintenance, Consolidation, and Growth Control

AMA frameworks actively control the evolution of memory, ensuring high utility, bounded growth, and generalization.

  • Verifiable Write Admission: SEDM (Xu et al., 11 Sep 2025) injects only empirically validated (“A/B tested”) snippets into the global store, assigning utility-based weights:

    $S = \Delta R - \lambda_L \Delta L - \lambda_T \Delta T$

    where $\Delta R$ is the reward gain, $\Delta L$ the added latency, and $\Delta T$ the added token consumption.

  • Self-Scheduling and Pruning: Memory controllers dynamically score and rank memories at retrieval (hybrid of similarity and empirical utility), promoting, decaying, or merging entries based on realized benefit and frequency of use.
  • Hierarchical Summarization and Compaction: Episodic summarization condenses long collaborations into high-utility summaries at milestones (Zhang et al., 30 Jan 2026, Zhang et al., 9 Jun 2025), enabling efficient long-horizon adaptation.
  • Memory Diffusion and Generalization: Abstraction mechanisms propagate distilled, type-generic memories across agents and domains (Xu et al., 11 Sep 2025), supporting transfer and avoiding contamination from irrelevant specifics.
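
The utility-gated write policy can be sketched as follows; the $\lambda$ values, the A/B comparison interface, and the weight bookkeeping are assumptions layered on the $S = \Delta R - \lambda_L \Delta L - \lambda_T \Delta T$ score, not SEDM's actual code.

```python
def admission_score(delta_r, delta_l, delta_t, lam_l=0.1, lam_t=0.01):
    """S = dR - lam_L*dL - lam_T*dT: reward gain minus latency and token penalties."""
    return delta_r - lam_l * delta_l - lam_t * delta_t

def admit_if_validated(snippet, reward_without, reward_with,
                       latency_delta, token_delta, store):
    """A/B-style gate: write the snippet only if its measured net utility
    is positive; the score doubles as the entry's weight for later ranking."""
    s = admission_score(reward_with - reward_without, latency_delta, token_delta)
    if s > 0:
        store.append({"snippet": snippet, "weight": s})
        return True
    return False
```

Because the gate compares a run with the snippet against one without it, memory growth is bounded by demonstrated benefit rather than by how often agents write.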

4. Role-Aware and Task-Oriented Adaptation

AMA research emphasizes memory adaptation to agent role and task context, moving beyond undifferentiated context dumps to customized, high-utility retrieval.

  • Role-conditioned Latent Bottlenecks: LatentMem's Memory Composer $\sigma_\phi(\gamma_{\alpha_j}, \mathcal{T}_q)$ injects role-profile embeddings into latent memory construction, which is empirically verified to avoid collapse into homogeneous representations and to mitigate “one-size-fits-all” inefficiency (Fu et al., 3 Feb 2026).
  • Task-aware Adversarial Construction: “Adversarial Memory Adaptation” (AMA) (Deng et al., 29 Jan 2026) leverages Challenger, Evaluator, and Adapter agents to simulate downstream QA during the (offline) memory update stage, aligning constructed memories with future reasoning requirements and producing statistically significant F1 and BLEU improvements.
  • Team and Individual Credit Assignment: In MAICC (Jiang et al., 13 Nov 2025), the hybrid retrieval utility

    $S_{\text{util}}(\tau) = \alpha \cdot \operatorname{norm}(R_{\text{team}}(\tau)) + (1-\alpha) \cdot \operatorname{norm}(\tilde{R}_j(\tau))$

    ensures both joint-task and agent-specific return maximization in trajectory retrieval, supporting efficient decentralized adaptation.
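
A toy version of that hybrid utility, scoring candidate trajectories by a convex mix of normalized team and per-agent returns; min-max normalization and the $\alpha$ value are illustrative choices, not MAICC's exact normalizer.

```python
def minmax(values):
    """Min-max normalize to [0, 1]; constant inputs map to 0.5."""
    lo, hi = min(values), max(values)
    return [0.5 if hi == lo else (v - lo) / (hi - lo) for v in values]

def hybrid_utility(team_returns, agent_returns, alpha=0.75):
    """S_util = alpha*norm(R_team) + (1-alpha)*norm(R_agent), per trajectory."""
    nt, na = minmax(team_returns), minmax(agent_returns)
    return [alpha * t + (1 - alpha) * a for t, a in zip(nt, na)]
```

With $\alpha$ near 1 retrieval favors trajectories that helped the whole team; lowering it lets an agent recall experiences that paid off for its own role even when the team return was mediocre.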

5. Communication Protocols and Distributed Coordination

AMA frameworks employ multi-agent communication, both explicit and implicit, to coordinate context sharing and distributed memory updates.

  • Structured Communication Protocols: DAMCS (Yang et al., 8 Feb 2025) defines message schemas and schedules communication along neighbor chains, filtering message content for task relevance and minimizing bandwidth by only transmitting prioritized facts.
  • Memory Sharing with Security and Persistence: Secure, persistent context exchange via encrypted key-value stores and fine-grained access control enables cross-session and cross-agent knowledge continuity (SAMEP (Masoor, 5 Jul 2025)), with batch vector search, compliance, and auditability.
  • Decentralized Optimization and Graph Formation: DeLAMA (Tang et al., 2024) employs decentralized, dual-ascent graph learning and distributed parameter fusion, with each agent independently updating finite memory and graph weights to discover collaboration strategies and retain lifelong learning capabilities.
  • Peer-to-Peer Gradient Aggregation: Distributed Associative Memory (Wang et al., 26 Sep 2025) organizes communication as Steiner routing trees, enabling agents to update their local parameters using delayed gradients from selected peers, achieving sublinear regret bounds.
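
To make the neighbor-chain idea concrete, here is a hypothetical relay that forwards only prioritized, task-relevant facts along a chain of agents; the (fact, priority) schema, priority floor, and per-hop budget are all invented for illustration and stand in for DAMCS's message schemas.

```python
def relay_along_chain(chain, local_facts, priority_floor=0.5, budget=3):
    """Each agent merges its local facts into the carried set, then filters
    by priority and caps bandwidth before forwarding to the next neighbor.
    Returns the message each agent sends onward."""
    forwarded = {}
    carried = []
    for agent in chain:
        carried.extend(local_facts.get(agent, []))
        # Keep only facts above the relevance floor, highest priority first.
        carried = sorted((f for f in carried if f[1] >= priority_floor),
                         key=lambda f: f[1], reverse=True)[:budget]
        forwarded[agent] = list(carried)
    return forwarded
```

Filtering at every hop keeps per-link bandwidth constant in the chain length, at the cost of possibly dropping low-priority facts that a distant agent would have wanted.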

6. Empirical Results, Benchmarks, and Performance Analysis

Extensive benchmark evaluations demonstrate that AMA-based systems advance efficiency, success rates, adaptability, and scalability across diverse multi-agent environments and tasks.

  • Token and Step Efficiency: MiTa (Zhang et al., 30 Jan 2026) achieves the lowest average steps (34.4 vs. 61.9 for MHP, 39.3 for CoELA, and 43.8 for ProAgent), representing a 68% efficiency improvement in VirtualHome-Social C-WAH tasks.
  • Robustness and Scalability: Both MiTa (Zhang et al., 30 Jan 2026) and LatentMem (Fu et al., 3 Feb 2026) maintain low overhead and consistent performance when module backbones are downgraded, and are scalable to larger teams.
  • Reasoning and QA Gains: AMA (Huang et al., 28 Jan 2026) reduces token usage by ≈80% compared to full-context reasoning while achieving higher accuracy/LLM scores on LoCoMo and LongMemEval benchmarks; G-Memory (Zhang et al., 9 Jun 2025) increases embodied task success by up to 20.89% and QA by 10.12%.
  • Role and Task Adaptation: LatentMem improves task performance by up to 19.36%, and ablations confirm that both role awareness and experience updating are indispensable to these gains.
| Framework | Gains (Reasoning/Task) | Token/Step Savings | Customization |
|-----------|------------------------|--------------------|---------------|
| MiTa | +68% EI, fewer steps | Robust to LLM swaps | Hierarchical, episodic-to-global |
| LatentMem | +19.36% (QA/coding/planning) | 50% fewer tokens | Role-aware latent |
| G-Memory | +20.89% action, +10.12% QA | Modest (<10%) overhead | Insight–query–trajectory graphs |
| SAMEP | +73% redundant computation cut | +89% context relevance | Secure, persistent |

Note: EI = efficiency improvement.

7. Limitations and Future Directions

  • Scalability Constraints: Hierarchical memories or episodic summaries can grow unbounded without strategic compaction (MiTa, AMA).
  • Agent Reliance on Advanced Models: Downgrading the Manager, Judge, or Refresher LLMs can sharply degrade system coherence and efficiency.
  • Adaptivity and Learning: Many thresholds (retrieval granularity, weighting) remain static; learning dynamic retrieval and update policies, or integrating reinforcement signal (as in LatentMem/MLC-Agent), represent promising directions.
  • Security and Privacy: Persistent and cross-agent memories require robust encryption, access control, and compliance protocols (SAMEP), particularly in cross-domain or regulated settings.

Future work aims to: (1) automate adaptive parameter tuning (e.g., retrieval budgets) via meta-optimization, (2) deploy neural abstraction for knowledge diffusion, (3) support multi-modal, distributed, and federated memory with dynamic relevance-based recall, (4) formalize theoretical bounds on memory growth and convergence under open-ended collaboration, and (5) extend adversarial adaptation to multi-modal and streaming contexts (Huang et al., 28 Jan 2026, Deng et al., 29 Jan 2026, Fu et al., 3 Feb 2026, Masoor, 5 Jul 2025).
