- The paper introduces SEDM, a self-evolving distributed memory system that improves multi-agent task efficiency by employing verifiable write admission and empirical evaluations.
- It employs a self-scheduling controller that optimizes memory retrieval using semantic similarity and weight adjustments based on empirical metrics.
- Experimental results on the FEVER and HotpotQA benchmarks show improved accuracy and reduced token overhead, supporting its scalability and adaptability.
SEDM: Scalable Self-Evolving Distributed Memory for Agents
SEDM is a memory management system for long-term, open-ended multi-agent systems. It targets three common problems of conventional agent memory: noise accumulation, uncontrolled memory expansion, and limited domain generalization. To address them, it turns memory from static storage into an active, adaptable component.
Introduction
Large-scale multi-agent systems (MAS) are widely used in areas such as collaborative reasoning and autonomous planning. For long-term collaboration to work, agents must manage their historical interactions and trajectories efficiently. Current memory methods often rely on hierarchical storage or vector retrieval, both of which degrade as noise accumulates and memory grows without control. SEDM addresses this by turning memory into a verifiable, self-optimizing framework.
SEDM Framework
Figure 1: Illustration of different memory strategies. No Memory: the agent interacts with the environment without retaining past information. Fixed Memory: the agent retrieves from a static memory pool, which may grow excessively. SEDM: introduces verifiable write admission, parallel simulation, and adaptive scheduling to build high-quality, self-evolving memory that supports efficient and transferable knowledge use.
SEDM integrates several key components:
SCEC-based Write Admission
Memory items are admitted through the Self-Contained Execution Context (SCEC), which packages a task execution so that it can be replayed and evaluated without access to the live environment. Verifiable write admission ensures that only high-quality experiences enter memory: a candidate memory item is extracted from each SCEC and validated empirically with A/B testing, which measures its effect on reward, latency, and token consumption. An item is admitted only if its measured benefit clears a preset threshold.
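The admission gate described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `ReplayResult` type, the linear scoring formula, the penalty weights `lam_latency` and `lam_tokens`, and the default threshold are all assumptions made here for clarity. The source states only that reward, latency, and token deltas are measured and compared against a preset threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplayResult:
    """Outcome of one replay of a task (hypothetical structure)."""
    reward: float    # task reward, higher is better
    latency: float   # seconds, lower is better
    tokens: int      # tokens consumed, lower is better


def admission_score(original: ReplayResult, injected: ReplayResult,
                    lam_latency: float = 0.1,
                    lam_tokens: float = 0.001) -> float:
    """Combine the A/B deltas into one score.

    Assumption: a linear trade-off in which reward gains count
    positively and extra latency or tokens count negatively.
    """
    d_reward = injected.reward - original.reward
    d_latency = injected.latency - original.latency
    d_tokens = injected.tokens - original.tokens
    return d_reward - lam_latency * d_latency - lam_tokens * d_tokens


def admit(original: ReplayResult, injected: ReplayResult,
          threshold: float = 0.0) -> Optional[float]:
    """Return the initial weight if the candidate is admitted, else None."""
    score = admission_score(original, injected)
    if score > threshold:
        return score  # assumption: the score doubles as the initial weight
    return None


# Usage: replay the same task without and with the candidate memory.
baseline = ReplayResult(reward=0.5, latency=2.0, tokens=1000)
with_memory = ReplayResult(reward=0.9, latency=2.5, tokens=1100)
print(admit(baseline, with_memory))
```

In this sketch a candidate that raises reward but costs far more tokens can still be rejected, which reflects the intent that admitted memories provide a net benefit rather than a gain on a single metric.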
Self-Scheduling Controller
Figure 2: SEDM architecture. Left: task execution generates traces that are packaged into a Self-Contained Execution Context (SCEC) with inputs, outputs, tool summaries, seeds, and hashes. Bottom: from each SCEC, a candidate memory is extracted and evaluated via paired A/B replay (Original vs. Injected); distributed verification computes ΔReward, ΔLatency, and ΔTokens, and an admission gate accepts the item and assigns its initial weight if the score is positive, else discards it. Right: the memory controller performs (a) memory scheduling using s(q,m)=sim(q,m)×w(m) for retrieval and injection.
The self-scheduling memory controller maintains the repository and ranks memories for retrieval using weights derived from empirical validation. It multiplies semantic similarity by these weights, s(q,m)=sim(q,m)×w(m), to decide which memories to inject into the current task context, which reduces retrieval noise and improves decision quality.
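A small sketch of weighted retrieval using the score s(q,m)=sim(q,m)×w(m) from Figure 2 is shown below. Cosine similarity over plain Python lists stands in for sim, and the dictionary layout of a memory entry, the function names, and the `top_k` parameter are assumptions for illustration. The source does not specify the similarity function or the data layout.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def schedule(query_vec, memories, top_k=2):
    """Rank memories by s(q, m) = sim(q, m) * w(m) and keep the top_k.

    Each memory is a dict with hypothetical keys:
    'id', 'embedding' (list of floats), and 'weight' (float).
    Returns a list of (score, id) pairs, best first.
    """
    scored = [
        (cosine(query_vec, m["embedding"]) * m["weight"], m["id"])
        for m in memories
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Usage: m1 is the closest match, but m2 carries a higher validated weight.
memories = [
    {"id": "m1", "embedding": [1.0, 0.0], "weight": 0.5},
    {"id": "m2", "embedding": [1.0, 1.0], "weight": 1.0},
    {"id": "m3", "embedding": [0.0, 1.0], "weight": 2.0},
]
for score, mem_id in schedule([1.0, 0.0], memories):
    print(mem_id, round(score, 3))
```

The example is built so that the most similar memory does not rank first: a memory with a somewhat lower similarity but a higher empirical weight wins, while a heavily weighted but irrelevant memory is excluded.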
Cross-Domain Knowledge Diffusion
SEDM leverages cross-domain knowledge diffusion to abstract reusable insights, allowing distilled knowledge to traverse and adapt across heterogeneous tasks. By framing memory entries as portable assets, it facilitates testing of memory efficacy across diverse domains, reducing runtime complexity and avoiding cold starts.
Experimental Evaluation
SEDM was evaluated against strong memory baselines, such as G-Memory, on benchmarks including FEVER and HotpotQA. The reported results show that SEDM consistently improves accuracy while keeping token overhead under control, and that it offers stronger temporal reasoning and more efficient handling of large-scale multi-agent interactions.
Results on Specific Benchmarks
- FEVER: SEDM achieved the highest task accuracy, significantly outperforming the no-memory baseline, with fewer tokens consumed compared to G-Memory.
- HotpotQA: SEDM demonstrated robustness in multi-hop reasoning tasks, showing improved scores and efficient memory handling.
These findings support the claim that SEDM's memory design is scalable and adaptive, and that it reduces computational demands while improving long-term MAS performance.
Conclusion
SEDM advances MAS memory management with a scalable, self-improving framework that improves reasoning accuracy while reducing operational overhead. The experimental results support its adaptability and efficiency across tasks and suggest that it could serve as a foundation for future MAS deployments in diverse and complex environments.