
MemGen: Weaving Generative Latent Memory for Self-Evolving Agents (2509.24704v1)

Published 29 Sep 2025 in cs.CL

Abstract: Agent memory shapes how LLM-powered agents, akin to the human brain, progressively refine themselves through environment interactions. Existing paradigms remain constrained: parametric memory forcibly adjusts model parameters, and retrieval-based memory externalizes experience into structured databases, yet neither captures the fluid interweaving of reasoning and memory that underlies human cognition. To address this gap, we propose MemGen, a dynamic generative memory framework that equips agents with a human-esque cognitive faculty. It consists of a \textit{memory trigger}, which monitors the agent's reasoning state to decide explicit memory invocation, and a \textit{memory weaver}, which takes the agent's current state as stimulus to construct a latent token sequence as machine-native memory to enrich its reasoning. In this way, MemGen enables agents to recall and augment latent memory throughout reasoning, producing a tightly interwoven cycle of memory and cognition. Extensive experiments across eight benchmarks show that MemGen surpasses leading external memory systems such as ExpeL and AWM by up to $38.22\%$, exceeds GRPO by up to $13.44\%$, and exhibits strong cross-domain generalization ability. More importantly, we find that without explicit supervision, MemGen spontaneously evolves distinct human-like memory faculties, including planning memory, procedural memory, and working memory, suggesting an emergent trajectory toward more naturalistic forms of machine cognition.

Summary

  • The paper introduces a generative memory framework that interweaves reasoning with latent memory, outperforming leading external memory systems by up to 38.22%.
  • The methodology uses a memory trigger via reinforcement learning and a memory weaver to generate context-rich latent token sequences.
  • MemGen demonstrates robust cross-domain generalization and mitigates catastrophic forgetting, paving the way for human-like cognitive AI systems.

MemGen: Weaving Generative Latent Memory for Self-Evolving Agents

Introduction

The paper "MemGen: Weaving Generative Latent Memory for Self-Evolving Agents" proposes a novel memory framework for LLM-powered agents. Existing paradigms, such as parametric and retrieval-based memory, fall short of human-like cognition because they cannot fluidly interweave reasoning and memory. MemGen introduces a dynamic generative memory model that mimics the human cognitive cycle: a "memory trigger" detects reasoning states that call for memory recall, and a "memory weaver" constructs latent token sequences that enrich cognitive processing. Experimental results demonstrate MemGen's superiority, outperforming existing memory systems by significant margins and exhibiting emergent cognitive faculties such as planning and procedural memory.

Methodology

MemGen combines two main components: the memory trigger and the memory weaver. The memory trigger uses reinforcement learning to discern when memory should be invoked, relying on hidden-state vectors generated during reasoning to estimate the likelihood that memory integration will help. When invoked, the memory weaver dynamically generates latent memory sequences, synthesizing external data with the agent's internal cognitive state to enrich problem-solving (Figure 1).

Figure 1: Overview of the proposed MemGen framework, illustrating interaction between memory trigger and weaver.
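The trigger/weaver interaction described above can be sketched as a simple generation loop. This is a minimal, hypothetical illustration, not the paper's implementation: the classes, the mean-activation scoring rule, and the `<mem_i>` token names are all stand-in assumptions for the learned components.

```python
# Toy sketch of MemGen's loop: after each reasoning step, a trigger
# inspects the hidden state; when it fires, a weaver emits latent
# "memory tokens" that are woven into the trace before the next step.

class MemoryTrigger:
    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def should_invoke(self, hidden_state):
        # Stand-in for a learned classifier over the hidden state:
        # here we simply score the mean activation.
        score = sum(hidden_state) / len(hidden_state)
        return score > self.threshold

class MemoryWeaver:
    def weave(self, hidden_state, num_latent_tokens=4):
        # Stand-in for generating machine-native latent tokens
        # conditioned on the current reasoning state.
        return [f"<mem_{i}>" for i in range(num_latent_tokens)]

def reasoning_loop(steps, trigger, weaver):
    trace = []
    for hidden_state in steps:
        if trigger.should_invoke(hidden_state):
            trace.extend(weaver.weave(hidden_state))
        trace.append("<step>")
    return trace

trace = reasoning_loop(
    steps=[[0.2, 0.3], [0.8, 0.9], [0.1, 0.0]],  # synthetic hidden states
    trigger=MemoryTrigger(threshold=0.5),
    weaver=MemoryWeaver(),
)
print(trace)
```

Only the second step's hidden state crosses the threshold, so latent memory tokens are woven in just before that step, mirroring the selective invocation the paper describes.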

Experimental Results

MemGen significantly outperformed strong baselines, including parametric and retrieval-based memory models. Across eight benchmarks, MemGen yielded up to a 38.22% improvement over state-of-the-art memory systems. It also demonstrated robust cross-domain generalization and strengthened continual learning, reducing the catastrophic forgetting typically seen in parametric models. MemGen's ability to spontaneously develop human-like memory categories further showcases its potential for mimicking complex cognitive behaviors.

Analysis of Memory and Reasoning Interleaving

A core aspect of MemGen's design is the seamless interleaving of memory and reasoning, which effectively bypasses the static reliance seen in retrieval-based memory systems. As shown in Figure 2, t-SNE visualizations indicate clear clustering of similar memory types, suggesting that MemGen is proficient in organizing and utilizing memory in a meticulous, segmented manner akin to human cognition.

Figure 2: t-SNE visualization of latent memories generated by MemGen across different datasets.

Implications and Future Developments

MemGen's results indicate actionable pathways toward machines capable of fluid, context-aware reasoning reminiscent of human thought. As AI agents evolve with these advanced memory paradigms, potential applications span complex decision-making environments, from autonomous vehicles to personalized healthcare systems. Future work could explore more granular integration of memory types and scaling mechanisms to broaden MemGen's applicability.

Conclusion

MemGen introduces a comprehensive and dynamic architecture for integrating memory into AI systems, overcoming the limitations of traditional paradigms by enabling seamless reasoning and recall cycles. Through extensive experimentation and analysis, MemGen proves to be a significant advancement in building AI agents that emulate human-like cognitive processes, exhibiting emergent memory faculties and adaptability across diverse tasks.


Explain it Like I'm 14

Overview

This paper introduces MemGen, a new way to give AI agents a “memory” that works more like a human’s. Instead of just stuffing facts into the AI or copying notes from a database, MemGen lets the AI create and use its own short, internal “machine notes” while it is thinking. This helps the AI learn from experience, make better decisions, and avoid forgetting old knowledge.

What questions does the paper ask?

The authors focus on a simple idea: how can an AI’s memory work smoothly with its thinking, like a real mind? They ask:

  • Can an AI decide for itself when it needs to “remember” something during problem-solving?
  • Can it generate helpful internal memory (not just retrieve old text) to guide its next steps?
  • Can this memory help across different tasks (like math, coding, science, and web search) without causing the AI to forget other things?

How does MemGen work?

MemGen adds two small “helpers” around a regular LLM (the main AI stays frozen and unchanged). Think of the main AI as a student, and the helpers as tools that improve the student's study habits.

  • First, why not just use standard methods?
    • Parametric memory: Fine-tuning a model’s parameters is like rewriting the student’s brain. It can help but may cause “catastrophic forgetting” (losing general knowledge).
    • Retrieval memory: Grabbing old notes from a database is like pasting extra text into the homework. It’s helpful, but clunky and not truly part of the AI’s thinking.

MemGen’s idea: let the AI generate short, machine-only memory on the fly that blends into its reasoning.

The two key parts

  • Memory Trigger (when to recall)
    • What it does: Watches the AI’s ongoing thoughts as it writes each sentence. When it senses the AI is stuck or about to make a mistake, it decides to “invoke memory.”
    • How it learns: Using reinforcement learning (RL), a trial-and-error method where the trigger gets rewarded when its decision leads to better results. It also learns not to trigger too often.
  • Memory Weaver (what to recall)
    • What it does: When triggered, it creates a short sequence of special “latent tokens” (think of them as tiny, machine-only note cards) based on the AI’s current state. These notes are not human-readable, but the AI understands them.
    • How it helps: The AI then continues thinking with these new notes woven into its internal process, improving planning, accuracy, and consistency.
    • Extra: The weaver can also combine info from outside sources (like a database) but turns it into those compact, machine-native notes so it fits smoothly into the AI’s thinking.
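The trigger's trial-and-error training signal described above can be illustrated with a toy reward: invoking memory is rewarded when it improves the outcome, and each invocation carries a small cost so the trigger learns not to fire too often. The exact reward shape here is an illustrative assumption, not the paper's objective.

```python
# Toy reward shaping for the memory trigger: reward = task improvement
# from invoking memory, minus a per-invocation cost that discourages
# triggering when memory does not help.

def trigger_reward(score_with_memory, score_without_memory,
                   num_invocations, invocation_cost=0.05):
    improvement = score_with_memory - score_without_memory
    return improvement - invocation_cost * num_invocations

# Two invocations that raise the task score by 0.3 net a positive reward;
# four invocations with no improvement are penalized.
good = trigger_reward(0.9, 0.6, num_invocations=2)
bad = trigger_reward(0.6, 0.6, num_invocations=4)
print(good, bad)
```

Under this shaping, the trigger is pushed toward exactly the behavior reported in the paper: invoking memory where it helps and staying quiet where it does not.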

Because the main AI stays frozen, MemGen avoids catastrophic forgetting. Only the small helper parts are trained.
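The frozen-base setup can be made concrete with a toy update step: only parameter groups marked trainable are touched, so the base model's weights, and the knowledge they encode, never move. The parameter layout and plain-SGD rule here are illustrative assumptions, not MemGen's actual training code.

```python
# Toy illustration of "frozen base, trainable helpers": a gradient step
# updates only the trigger and weaver, leaving the base LLM untouched,
# which is why catastrophic forgetting is avoided.

params = {
    "base_llm": {"weights": [1.0, 2.0], "trainable": False},
    "memory_trigger": {"weights": [0.1], "trainable": True},
    "memory_weaver": {"weights": [0.2, 0.3], "trainable": True},
}

def sgd_step(params, grads, lr=0.1):
    for name, group in params.items():
        if not group["trainable"]:
            continue  # frozen: the base model's weights never change
        group["weights"] = [w - lr * g
                            for w, g in zip(group["weights"], grads[name])]

sgd_step(params, grads={"memory_trigger": [1.0],
                        "memory_weaver": [1.0, 1.0]})
print(params["base_llm"]["weights"])       # unchanged by training
print(params["memory_trigger"]["weights"]) # updated
```

In a real system these groups would be neural modules (e.g. with gradients disabled on the base model), but the principle is the same: learning happens only in the small helpers.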

What did they find?

Across many benchmarks (math, coding, science questions, web trivia, and a game-like environment where the AI performs actions), MemGen showed strong results.

Here are the key takeaways:

  • Better performance: MemGen beat popular memory systems (like retrieval-based ones) by as much as about 38%, and even outperformed a strong RL method (GRPO) by up to about 13% on some tasks.
  • Works across domains: Training MemGen on one area (like math) also helped in others (like science and coding). The trigger learned to invoke memory more in tasks where it helped and less where it didn’t—so it didn’t “force” memory where it wasn’t useful.
  • Less forgetting: Because the main AI is not rewritten, it kept old skills while learning new ones. This is important for long-term learning.
  • Human-like memory roles emerged: By studying which “latent notes” mattered for which mistakes, the authors found three natural types of machine memory:
    • Planning memory: helps with overall strategy and step ordering.
    • Procedural memory: helps with doing things correctly, like using tools and formatting answers.
    • Working memory: helps keep track of what’s going on in a long problem so the AI stays consistent.
  • Efficient: The extra thinking time stayed small; the system did not significantly slow the AI down.

Why is this important?

MemGen points toward AI that learns more like people do: thinking and remembering at the same time, and creating helpful “mental notes” on the spot. This can make AI assistants:

  • more reliable (they plan better and make fewer careless mistakes),
  • more adaptable (they improve in one area without hurting others),
  • and safer to update (no need to rewrite the whole model and risk forgetting).

In practical terms, this could improve tutoring bots that solve math step-by-step, coding assistants that remember how to use tools, research helpers that keep context straight across long tasks, and more. It’s a step toward self-improving AI agents with memory that feels less like a bolted-on database and more like a real, flexible mind.
