- The paper introduces an OS-inspired hierarchical memory operating system that enhances long-term dialogue coherence and personalized response generation.
- The paper outlines a segmented memory model with three tiers (STM, MTM, and LPM) and a heat-based eviction mechanism for dynamic memory updating.
- The paper demonstrates superior results, with average improvements of 49.11% in F1 and 46.18% in BLEU-1 on LoCoMo, alongside reduced token consumption compared to existing models.
MemoryOS: A Hierarchical Memory Operating System for Long-Term AI Agent Personalization and Coherence
Motivation and Background
LLMs demonstrate strong performance in text comprehension and generation but suffer from severe limitations in memory management. The fixed context window precludes sustained coherence in long-term, multi-session interactions, resulting in fragmented memory, factual inconsistency, and diminished personalization. Existing approaches—including knowledge-organization methods (e.g., A-Mem), retrieval-oriented methods (e.g., MemoryBank), and architecture-driven methods (e.g., MemGPT)—tend to operate in isolation and lack a unified, comprehensive framework for memory organization, retrieval, and updating.
The "Memory OS of AI Agent" (2506.06326) proposes an integrated memory management system, MemoryOS, inspired by the segmented paging memory management of classical operating systems (OS), to fill this gap. MemoryOS aims to systematically manage memory for AI agents, achieving hierarchical storage, dynamic updating, adaptive retrieval, and personalized generation.
MemoryOS Architecture and Modules
MemoryOS introduces a four-module hierarchical memory management paradigm:
- Memory Storage: Implements a three-tier hierarchy—Short-Term Memory (STM), Mid-Term Memory (MTM), and Long-term Persona Memory (LPM). STM maintains a FIFO queue of recent dialogue pages linked by dialogue chains for context tracking; MTM uses segmented paging to organize topic-coherent dialogue segments, each with multiple pages; LPM stores robust user/agent persona, traits, and user knowledge bases.
- Memory Updating: Handles both intra-unit and inter-tier transitions. STM-to-MTM uses FIFO chain transfer upon queue overflow; MTM-to-LPM employs a heat-based metric (retrieval count, interaction length, recency with exponential decay) to prioritize segment migration and eviction.
- Memory Retrieval: Implements multi-source retrieval. STM returns recent context; MTM uses a two-stage retrieval (segment selection via semantic/keyword match, page selection via similarity score); LPM retrieves top-k entries for persona and factual alignment. All retrieved memories are assembled to inform response generation.
- Response Generation: Constructs final LLM prompts by integrating recent context, relevant historical pages, and personalized traits, ensuring conversational coherence, depth, and personalized responses.
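The three-tier layout and the STM-to-MTM transitions above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class and method names are invented, and Jaccard word overlap stands in for the semantic-embedding similarity the paper uses for segment and page matching.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Page:
    """One dialogue page: a user query and the agent's response."""
    query: str
    response: str

@dataclass
class Segment:
    """A topic-coherent group of pages in mid-term memory (MTM)."""
    topic_words: set
    pages: list = field(default_factory=list)

class MemoryStore:
    """Three-tier store: STM (FIFO queue), MTM (segmented paging), LPM (persona)."""

    def __init__(self, stm_capacity: int = 3):
        self.stm = deque()              # FIFO queue of recent pages
        self.stm_capacity = stm_capacity
        self.mtm = []                   # list of topic segments
        self.lpm = {"user_traits": set(), "agent_persona": set()}

    @staticmethod
    def _words(page: Page) -> set:
        return set((page.query + " " + page.response).lower().split())

    @staticmethod
    def _jaccard(a: set, b: set) -> float:
        # Word-overlap similarity; a stand-in for embedding similarity.
        return len(a & b) / max(len(a | b), 1)

    def add_page(self, page: Page) -> None:
        """Append to STM; on overflow, FIFO-transfer the oldest page to MTM."""
        self.stm.append(page)
        if len(self.stm) > self.stm_capacity:
            self._insert_into_mtm(self.stm.popleft())

    def _insert_into_mtm(self, page: Page) -> None:
        """Attach the page to the most similar topic segment, else open a new one."""
        words = self._words(page)
        best = max(self.mtm, key=lambda s: self._jaccard(words, s.topic_words),
                   default=None)
        if best is not None and self._jaccard(words, best.topic_words) >= 0.2:
            best.pages.append(page)
            best.topic_words |= words
        else:
            self.mtm.append(Segment(topic_words=words, pages=[page]))

    def retrieve(self, query: str, top_pages: int = 2):
        """Two-stage MTM retrieval: select the best segment, then rank its pages."""
        q = set(query.lower().split())
        recent = list(self.stm)         # STM contributes recent context as-is
        if not self.mtm:
            return recent, []
        seg = max(self.mtm, key=lambda s: self._jaccard(q, s.topic_words))
        ranked = sorted(seg.pages,
                        key=lambda p: self._jaccard(q, self._words(p)),
                        reverse=True)
        return recent, ranked[:top_pages]
```

The 0.2 overlap threshold for opening a new segment is likewise an arbitrary placeholder; the paper's segment-assignment criterion operates on semantic similarity rather than raw word overlap.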
Methodological Advances
The hierarchical organization and OS-inspired segmented paging in MTM enable efficient context consolidation, topic maintenance, and memory scalability. Heat-based eviction balances recency and engagement, allowing dynamic retention of important conversational content.
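The heat metric driving eviction and MTM-to-LPM migration can be sketched as an engagement term scaled by exponential recency decay. The weights and decay rate below are illustrative assumptions: the paper combines retrieval count, interaction length, and recency, but its exact coefficients are not reproduced in this summary.

```python
import math
import time

def heat_score(retrieval_count: int,
               interaction_length: int,
               last_access_ts: float,
               now=None,
               alpha: float = 1.0,
               beta: float = 0.5,
               decay_rate: float = 1e-5) -> float:
    """Heat = weighted engagement terms scaled by exponential recency decay.

    alpha, beta, and decay_rate are hypothetical values chosen for
    illustration, not the paper's calibrated coefficients.
    """
    now = time.time() if now is None else now
    recency = math.exp(-decay_rate * (now - last_access_ts))
    return (alpha * retrieval_count + beta * interaction_length) * recency
```

Under this scheme, segments whose heat falls below a threshold are evicted from MTM, while persistently hot, persona-relevant content is promoted into LPM.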
Persona integration in LPM aggregates both static attributes (profiles) and evolving interests/preferences, supporting persistent adaptation and consistent agent identity. The methodology is generic—applicable to diverse LLM-based agents—and robust against context fragmentation or excessive memory noise.
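The response-generation step, which merges the three memory sources into a single LLM prompt, can be sketched as below. The section labels and ordering are a hypothetical template; the paper specifies the inputs (recent context, relevant history pages, persona traits) but this exact prompt layout is an assumption.

```python
def build_prompt(recent, retrieved, persona_traits, user_query: str) -> str:
    """Assemble the final LLM prompt from the three memory sources.

    recent / retrieved: lists of (user_query, agent_response) pairs from
    STM and MTM respectively; persona_traits: strings drawn from LPM.
    The bracketed section headers are illustrative, not the paper's format.
    """
    lines = ["[Persona] " + "; ".join(sorted(persona_traits)),
             "[Relevant history]"]
    lines += [f"User: {q}\nAgent: {r}" for q, r in retrieved]
    lines.append("[Recent context]")
    lines += [f"User: {q}\nAgent: {r}" for q, r in recent]
    lines.append(f"User: {user_query}\nAgent:")
    return "\n".join(lines)
```

Placing retrieved history before recent context keeps the most immediately relevant turns closest to the generation point, one plausible ordering among several.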
Empirical Evaluation
MemoryOS was extensively benchmarked on the GVD and LoCoMo datasets, the latter targeting ultra-long-term conversational memory (300+ turns, 9k tokens). Evaluation metrics include memory retrieval accuracy, response correctness, contextual coherence, F1, and BLEU-1.
Key empirical findings:
- MemoryOS surpasses previous SOTA (e.g., A-Mem, MemGPT) with average improvements of 49.11% (F1) and 46.18% (BLEU-1) in LoCoMo (GPT-4o-mini), indicating robust context retention and persona adherence in extremely long conversations.
- On GVD, MemoryOS also outperforms baselines, achieving 3.2% higher accuracy and 5.4% higher response correctness.
- Efficiency analysis shows MemoryOS requires fewer LLM calls per response (4.9 vs. 13 for A-Mem) and lower token consumption than MemGPT, supporting its scalability and practical viability.
- Ablation studies demonstrate the critical importance of MTM and LPM: removal of either reduces performance significantly, confirming effectiveness of hierarchical storage and persona modules.
Strong Claims and Contradictory Evidence
The paper claims to introduce the first comprehensive OS-inspired memory management framework for AI agents, and provides empirical evidence that isolated memory decay (e.g., MemoryBank) or flat architectures (e.g., MemGPT) are insufficient for sustained long-term interaction. The authors further challenge prior approaches by demonstrating that MemoryOS's multi-module, chained memory integration is superior on both correctness and efficiency metrics.
Practical and Theoretical Implications
The practical implications of MemoryOS are significant for deployment of personalized AI agents in realistic, long-running conversational scenarios. Efficient hierarchical memory management and persona adaptation facilitate domain transfer, persistent user relationships, and improved user experience. Theoretically, MemoryOS bridges classical OS principles with AI memory architecture, suggesting that logical segment-page abstraction and heat-based prioritization are well-suited for dynamic conversational memory management in LLM agents.
Future Directions
Future research can build upon MemoryOS by exploring:
- Expansion to multi-modal memory management (vision, knowledge graphs) for richer context representation.
- Adaptive scaling mechanisms using reinforcement learning for dynamic memory tier sizing and trait evolution.
- Integration with production-ready agent frameworks (e.g., Mem0 (Chhikara et al., 28 Apr 2025)) for robust real-world deployment.
- Advanced retrieval strategies combining emotional state or intent inference, further enhancing personalization and consistency.
Conclusion
MemoryOS provides a systematic, OS-inspired hierarchical memory operating system for AI agents, achieving superior performance in dialogue coherence, retrieval accuracy, and personalization across long-term interactions (2506.06326). Its architectural innovations and empirical results highlight the value of segmented paging, heat-based eviction, and persona-centric memory modules for sustained agent competence. MemoryOS sets a paradigm for future research in memory-augmented LLMs, enabling scalable and coherent user-agent dialogue in practical settings.