SEDM: Scalable Self-Evolving Distributed Memory for Agents (2509.09498v3)

Published 11 Sep 2025 in cs.AI

Abstract: Long-term multi-agent systems inevitably generate vast amounts of trajectories and historical interactions, which makes efficient memory management essential for both performance and scalability. Existing methods typically depend on vector retrieval and hierarchical storage, yet they are prone to noise accumulation, uncontrolled memory expansion, and limited generalization across domains. To address these challenges, we present SEDM, Self-Evolving Distributed Memory, a verifiable and adaptive framework that transforms memory from a passive repository into an active, self-optimizing component. SEDM integrates verifiable write admission based on reproducible replay, a self-scheduling memory controller that dynamically ranks and consolidates entries according to empirical utility, and cross-domain knowledge diffusion that abstracts reusable insights to support transfer across heterogeneous tasks. Evaluations on benchmark datasets demonstrate that SEDM improves reasoning accuracy while reducing token overhead compared with strong memory baselines, and further enables knowledge distilled from fact verification to enhance multi-hop reasoning. The results highlight SEDM as a scalable and sustainable memory mechanism for open-ended multi-agent collaboration. The code will be released in the later stage of this project.

Summary

The paper introduces SEDM, a self-evolving distributed memory framework for multi-agent systems that admits a memory entry only after reproducible replay shows that it helps.
A self-scheduling controller ranks entries for retrieval by combining semantic similarity with weights derived from empirical utility, and consolidates the repository over time.
Experiments on the FEVER and HotpotQA benchmarks report higher accuracy and lower token overhead than strong memory baselines.

SEDM: Scalable Self-Evolving Distributed Memory for Agents

SEDM is proposed as a memory management framework for long-term, open-ended multi-agent systems. It targets three known weaknesses of conventional memory management (noise accumulation, uncontrolled memory expansion, and limited generalization across domains) by turning memory from static storage into an active, adaptive component.

Introduction

Large-scale multi-agent systems (MAS) are widely used in areas such as collaborative reasoning and autonomous planning. Agents' ability to manage historical interactions and trajectories efficiently is crucial for effective long-term collaboration. Current memory methods often rely on hierarchical storage or vector retrieval, which degrade as noise accumulates and memory grows without control. SEDM addresses this by making memory a verifiable, self-optimizing framework.

SEDM Framework

Figure 1: Illustration of different memory strategies. No Memory: the agent interacts with the environment without retaining past information. Fixed Memory: the agent retrieves from a static memory pool, which may grow excessively. SEDM: introduces verifiable write admission, parallel simulation, and adaptive scheduling to build high-quality, self-evolving memory that supports efficient and transferable knowledge use.

SEDM integrates several key components:

SCEC-based Write Admission

Memory items are admitted through the Self-Contained Execution Context (SCEC), a package of each task execution that lets a multi-agent system replay and evaluate it without access to the original environment. The verifiable write admission mechanism is designed so that only high-quality experiences are added to memory. The admission process extracts a candidate memory item from each SCEC and validates it empirically with paired A/B replay, measuring the candidate's effect on reward, latency, and token consumption. Only candidates whose measured benefit clears a preset threshold are admitted, so every entry in memory has shown a benefit under replay.
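The admission gate described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the linear score (reward gain minus penalties on added latency and tokens), the penalty coefficients `lam_latency` and `lam_tokens`, and the names `ReplayResult`, `admission_score`, and `admit` are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ReplayResult:
    """Outcome of one replay of an SCEC (field names are hypothetical)."""
    reward: float
    latency_s: float
    tokens: int


def admission_score(original, injected, lam_latency=0.1, lam_tokens=0.001):
    """Score a candidate memory from paired A/B replays.

    `original` and `injected` are equal-length lists of ReplayResult,
    paired by index (same SCEC, without and with the candidate injected).
    The linear form and both coefficients are assumptions of this sketch.
    """
    if not original or len(original) != len(injected):
        raise ValueError("need a non-empty, equal number of paired replays")
    pairs = list(zip(original, injected))
    d_reward = mean(b.reward - a.reward for a, b in pairs)
    d_latency = mean(b.latency_s - a.latency_s for a, b in pairs)
    d_tokens = mean(b.tokens - a.tokens for a, b in pairs)
    return d_reward - lam_latency * d_latency - lam_tokens * d_tokens


def admit(original, injected, threshold=0.0):
    """Return (admitted, initial_weight); rejected candidates get weight 0."""
    score = admission_score(original, injected)
    if score > threshold:
        return True, score  # the score doubles as the initial weight here
    return False, 0.0
```

Under these assumptions, a candidate that raises reward without adding latency or tokens is admitted, while one that only adds tokens is discarded. Using the score as the initial weight is one simple choice; any monotone mapping from score to weight would fit the same gate.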

Self-Scheduling Controller

Figure 2: SEDM architecture. Left: task execution generates traces that are packaged into a Self-Contained Execution Context (SCEC) with inputs, outputs, tool summaries, seeds, and hashes. Bottom: from each SCEC, a candidate memory is extracted and evaluated via paired A/B replay (Original vs. Injected); distributed verification computes $\Delta$ Reward, $\Delta$ Latency, and $\Delta$ Tokens, and an admission gate accepts the item and assigns its initial weight if the score is positive, else discards it. Right: the memory controller performs (a) memory scheduling using $s(q,m)=\operatorname{sim}(q,m)\times w(m)$ for retrieval and injection.

The self-scheduling memory controller maintains the repository and ranks entries for retrieval using weights derived from empirical validation. It multiplies each entry's semantic similarity to the current query by that entry's weight, $s(q,m)=\operatorname{sim}(q,m)\times w(m)$, and uses the resulting score to decide what to inject into the current task context, which reduces retrieval noise and improves decision quality.
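The scoring rule $s(q,m)=\operatorname{sim}(q,m)\times w(m)$ from Figure 2 can be illustrated with a short sketch. It assumes cosine similarity over embedding vectors as the `sim` function and a top-k cutoff for injection; neither choice is specified in the text above, and the names `cosine` and `schedule` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def schedule(query_vec, memory, k=2):
    """Rank memory items by s(q, m) = sim(q, m) * w(m) and keep the top k.

    `memory` is a list of (item_id, embedding, weight) tuples. Items whose
    score is not positive are never injected.
    """
    scored = [(cosine(query_vec, emb) * w, item_id) for item_id, emb, w in memory]
    scored.sort(reverse=True)
    return [item_id for score, item_id in scored[:k] if score > 0]


memory = [
    ("a", [1.0, 0.0], 0.2),  # most similar to the query, but low weight
    ("b", [0.6, 0.8], 0.9),  # less similar, but high empirical weight
    ("c", [0.0, 1.0], 1.0),  # orthogonal to the query, so its score is zero
]
selected = schedule([1.0, 0.0], memory, k=2)
```

In this toy repository, item "b" outranks the more similar item "a" because its weight is higher, which is the behaviour the multiplicative rule is meant to produce: similarity alone does not decide what is injected.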

Cross-Domain Knowledge Diffusion

SEDM uses cross-domain knowledge diffusion to abstract reusable insights, so that knowledge distilled in one task can be transferred and adapted to heterogeneous tasks. By treating memory entries as portable assets, it allows their efficacy to be tested in other domains, reducing runtime complexity and avoiding cold starts.

Experimental Evaluation

SEDM was evaluated against strong memory baselines such as G-Memory across benchmarks like FEVER and HotpotQA. The experimental results show that SEDM consistently improves accuracy while managing token overhead effectively, offering superior temporal reasoning capabilities and efficient management of large-scale multi-agent interactions.

Results on Specific Benchmarks

FEVER: SEDM achieved the highest task accuracy, significantly outperforming the no-memory baseline, with fewer tokens consumed compared to G-Memory.
HotpotQA: SEDM demonstrated robustness in multi-hop reasoning tasks, showing improved scores and efficient memory handling.

These findings support SEDM's scalable and adaptive memory design, which reduces computational demands and improves long-term MAS performance.

Conclusion

SEDM advances MAS memory management with a scalable, self-improving framework that improves reasoning accuracy while reducing operational overhead. The experimental results support its adaptability and efficiency across tasks and suggest its potential as a foundational component for future MAS deployments in diverse and complex environments.