
MemoryOS: Adaptive Hierarchical Memory Systems

Updated 4 September 2025
  • MemoryOS is a programmable memory management paradigm that integrates hardware-software co-design and lifecycle-aware controls across heterogeneous memory layers.
  • It leverages innovations like page overlays, in-DRAM primitives, and Dirty-Block Index for fine-grained, energy-efficient memory operations and data migration.
  • MemoryOS supports AI applications by enabling adaptable memory hierarchies that optimize performance in long-term, personalized, and continual learning systems.

MemoryOS designates a set of principles, abstractions, and system architectures that elevate memory management from a low-level, largely opaque substrate to a directly visible and efficiently manipulable system resource. Initiated by foundational work in computer architecture and operating system research, “MemoryOS” has come to refer to both (i) hardware/software co-designs that augment traditional memory systems and (ii) emerging LLM/AI-centric architectures in which memory—of various forms and lifecycles—is managed programmably and governed with life-cycle awareness. Across its incarnations, MemoryOS is not a standalone operating system but a highly integrated substrate (in hardware, software, or both) that enables fine-grained, heterogeneity-aware, and performance-conscious management of hierarchical and heterogeneous memory resources.

1. Foundational Principles and Early Hardware-Software Co-Design

Initial MemoryOS concepts arose from the need to address intrinsic inefficiencies in classical DRAM and virtual memory systems, particularly the mismatch in granularity between OS-level page management and hardware-level cache/row semantics (Seshadri, 2016). The principal innovations are:

  • Page Overlays: Fine-grained overlays within virtual pages enable the OS to track and version modifications at the cache-line granularity, reducing redundant memory operations in copy-on-write, deduplication, and fork/clone scenarios.
  • In-DRAM Primitives: Mechanisms such as RowClone facilitate bulk copy/zero operations entirely within DRAM by exploiting row activation properties, while Buddy RAM leverages the analog behavior of sense amplifiers for bulk bitwise operations. Gather-Scatter DRAM extends efficiency to irregular memory access patterns.
  • Dirty-Block Index (DBI): Separately indexed dirty state per DRAM row or region enables efficient bulk cache coherence and writeback, essential for close-to-DRAM data movement.

This approach unifies hardware-exposed memory semantics (e.g., new DRAM operations) with OS-managed metadata, creating an “operating-system–aware” memory substrate—hence the term MemoryOS.
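The overlay and dirty-tracking ideas above can be illustrated with a small software model. This is a hypothetical sketch, not the hardware mechanism itself: the class name, sizes, and the assumption that writes stay within one cache line are all illustrative.

```python
# Hypothetical model of cache-line-granularity page overlays with a
# Dirty-Block-Index-style view of modified lines. Sizes are illustrative,
# and each write is assumed to fall within a single 64-byte line.

PAGE_SIZE = 4096
LINE_SIZE = 64

class OverlayPage:
    """A copy-on-write page that versions data at cache-line granularity."""

    def __init__(self, base: bytes):
        assert len(base) == PAGE_SIZE
        self.base = base      # shared, read-only backing page
        self.overlay = {}     # line index -> modified 64-byte line

    def write(self, offset: int, data: bytes) -> None:
        """Redirect a store into the overlay; the base page is untouched."""
        line = offset // LINE_SIZE
        start = line * LINE_SIZE
        buf = bytearray(self.overlay.get(line, self.base[start:start + LINE_SIZE]))
        pos = offset - start
        buf[pos:pos + len(data)] = data
        self.overlay[line] = bytes(buf)

    def read(self, offset: int, n: int) -> bytes:
        """Serve reads from the overlay when present, else from the base page."""
        line = offset // LINE_SIZE
        start = line * LINE_SIZE
        src = self.overlay.get(line, self.base[start:start + LINE_SIZE])
        return src[offset - start:offset - start + n]

    def dirty_lines(self) -> list:
        """DBI-style query: which lines are dirty and need writeback."""
        return sorted(self.overlay)
```

On a fork or clone, only the sparse `overlay` dictionary needs copying rather than the full page, which mirrors the redundancy savings the overlay design targets.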

2. Hierarchical and Heterogeneous Memory Management

MemoryOS is designed for systems with DRAM, non-volatile memory (NVM), and storage-class memories present simultaneously, necessitating vertical and horizontal integration:

  • Multi-Channel, Multi-Tier Coordination: Memos, for instance, introduces full-hierarchy memory management spanning caches, memory channels, banks, and heterogeneous media (DRAM/NVM). It leverages kernel monitoring (SysMon), predictive migration engines, and page-coloring techniques to map hot/write-intensive pages to DRAM and cold/read-dominant pages to NVM (Liu et al., 2017).
  • Hardware Management Units: In mobile systems, a hardware-accelerated memory manager (HMMU) transparently manages DRAM and NVM in a flat address space, employing counter-based page replacement, bloom-filter-guided block management, and adaptive sub-page caching. This hardware solution allows the system to approach 88% of all-DRAM performance with up to 39% less energy consumption (Wen et al., 2020).
  • Persistent Memory and Storage-Class Integration: HAMS merges byte-addressable NVDIMM and low-latency flash into a single, OS-transparent “MoS” (Memory-over-Storage) space, residing in the memory controller hub and exposing DRAM-like access semantics with on-hardware migration, cache, and address mapping. Advanced HAMS further reduces energy and latency by direct DDR4 interface to SSDs (Zhang et al., 2021).

These designs demonstrate that MemoryOS unifies memory layers with policy-driven placement, migration, and data structure adaptation, transcending hard architectural boundaries.

3. Lifecycle Management, Memory Abstraction, and Governance

As hybrid and persistent memories proliferate, MemoryOS must address challenges of latency, fragmentation, consistency, and endurance:

  • OS Memory Manager Overhauls: Classical buddy allocation, page-zeroing, and fragmentation heuristics, designed for DRAM, are inefficient on NVMMs. For terabyte-scale systems, MemoryOS must aggressively avoid in-place zeroing and enable process-tied page reuse. Compaction and migration heuristics must be media- and NUMA-aware, as remote page allocation latency can be up to 300× higher for large pages in NVMM compared to DRAM (Garg et al., 2023).
  • Lifecycle and Provenance of Knowledge: AI-centric MemoryOS designs, as deployed for LLMs, introduce MemCube abstractions that encapsulate memory content, type (plaintext, activation, parametric), provenance, and versioning. Memory objects can be composed, migrated, or fused to bridge retrieval-based and parameter-based learning, enabling controllability and evolution of memory representations (Li et al., 4 Jul 2025, Li et al., 28 May 2025).

Table: Core MemoryOS Abstractions in AI/LLMs

| Abstraction | Description | Functionality |
|---|---|---|
| MemCube | Standardized memory unit with metadata | Encapsulates, versions, and migrates memory content |
| Non-parametric memory | External, continually updatable memory resource | Efficient continual learning without catastrophic forgetting |
| Hierarchical memory | STM, MTM, LPM (AI agent context) | Temporal partitioning, topic cohorting, persona/context tracking |
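The MemCube abstraction can be sketched as a small data structure. This is a hypothetical reading of the published description, not the reference implementation: field names, the provenance-chain format, and the `fuse` helper are assumptions for illustration.

```python
# Hypothetical sketch of a MemCube-style memory unit: content plus the
# metadata (type, provenance, version) that lifecycle governance operates on.

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class MemCube:
    content: str        # plaintext payload here; could also be activations/params
    mem_type: str       # "plaintext" | "activation" | "parametric"
    provenance: str     # where this knowledge came from
    version: int = 1
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def revise(self, new_content: str, source: str) -> "MemCube":
        """Produce the next version, preserving the provenance chain."""
        return MemCube(new_content, self.mem_type,
                       f"{self.provenance} -> {source}", self.version + 1)

def fuse(a: MemCube, b: MemCube) -> MemCube:
    """Compose two cubes into one (sketch of cube fusion)."""
    return MemCube(a.content + "\n" + b.content, a.mem_type,
                   f"fuse({a.provenance}, {b.provenance})",
                   max(a.version, b.version) + 1)
```

Keeping provenance and version on every unit is what makes governance operations (audit, rollback, selective forgetting) possible without touching model parameters.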

4. MemoryOS in AI Agents and LLMs

Recent advances extend MemoryOS from systems software and hardware into AI agent architectures, motivated by the limitations of fixed context windows and purely parametric knowledge.

  • Hierarchical Memory Structures: MemoryOS, as implemented for AI agents, segments memory into short-term (STM: real-time dialogue, chain-linked), mid-term (MTM: topic-segmented, similarity-grouped, heat-scored), and long-term (LPM: persistent user/agent traits) stores. Updates employ FIFO (STM→MTM) and “heat”-scored promotion (MTM→LPM), supporting dynamic persona construction and personalized, coherent dialogue over thousands of conversational turns (Kang et al., 30 May 2025).
  • Experimental Validation: On the LoCoMo benchmark, this MemoryOS shows average F1 improvements of 49.11% and BLEU-1 improvements of 46.18% over GPT-4o-mini-based baselines in long conversation retention, with reduced LLM calls and token consumption.
  • Governance and Scheduling: AI-centric MemoryOS architectures embed explicit lifecycle management, permitting creation, consolidation, migration, and selective forgetting of memory units. This bridges stateless retrieval augmentation (RAG) and persistent, evolving knowledge bases (Li et al., 4 Jul 2025, Li et al., 28 May 2025).
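The STM→MTM→LPM update flow described above can be sketched as follows. Capacities, the heat formula, and the promotion threshold are illustrative assumptions; the published system's topic segmentation and similarity grouping are reduced here to a simple topic key.

```python
# Hypothetical sketch of hierarchical agent memory: FIFO eviction from
# short-term memory (STM) into mid-term memory (MTM), and heat-scored
# promotion from MTM into long-term persona memory (LPM).

from collections import deque

class HierarchicalMemory:
    def __init__(self, stm_cap: int = 4, heat_threshold: int = 3):
        self.stm = deque()               # recent dialogue turns (FIFO)
        self.mtm = {}                    # topic -> {"turns": [...], "heat": n}
        self.lpm = {}                    # persistent per-topic traits
        self.stm_cap = stm_cap
        self.heat_threshold = heat_threshold

    def add_turn(self, topic: str, turn: str) -> None:
        self.stm.append((topic, turn))
        if len(self.stm) > self.stm_cap:          # FIFO: oldest turn leaves STM
            t, old = self.stm.popleft()
            seg = self.mtm.setdefault(t, {"turns": [], "heat": 0})
            seg["turns"].append(old)
            seg["heat"] += 1                      # revisited topics heat up
            if seg["heat"] >= self.heat_threshold:
                self.lpm[t] = list(seg["turns"])  # promote hot topic to LPM
```

The key property is that nothing reaches the persistent tier by accident: a topic must recur often enough to cross the heat threshold, which is what lets the agent distill stable persona traits from transient dialogue.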

5. Continual Learning and Non-Parametric Memory Integration

MemoryOS is pivotal for continual learning, enabling models to adapt to evolving knowledge without catastrophic forgetting or costly parameter retraining:

  • Non-Parametric Continual Memory: MemoryOS for LLMs deploys external, continually updated non-parametric memory, queried during generation. Updates occur in the memory store, not the model parameters, allowing for stable-plasticity trade-off mitigation (Li et al., 4 Jul 2025, Li et al., 28 May 2025).
  • Bridging Retrieval and Learning: This non-parametric memory is lifecycle- and provenance-managed, facilitating updatable and consistent knowledge integration across multiple timescales and agents. A plausible implication is the emergence of persistent, multi-agent coordination and cross-modal learning systems based on MemoryOS.
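The non-parametric update path can be made concrete with a minimal store. This sketch uses naive token-overlap retrieval purely for illustration (a real system would use embeddings); the class and method names are assumptions. The point it demonstrates is structural: continual learning happens by editing the store, never the model parameters.

```python
# Sketch of a non-parametric external memory: knowledge updates rewrite the
# store rather than the model, so prior capabilities are not overwritten.

class ExternalMemory:
    def __init__(self):
        self.entries = {}            # key -> fact string

    def update(self, key: str, fact: str) -> None:
        """Continual-learning step: edit the store, not the parameters."""
        self.entries[key] = fact

    def retrieve(self, query: str, k: int = 1) -> list:
        """Return the k facts with the highest token overlap with the query."""
        words = set(query.lower().split())
        scored = sorted(self.entries.values(),
                        key=lambda f: len(words & set(f.lower().split())),
                        reverse=True)
        return scored[:k]
```

Because `update` overwrites by key, a knowledge edit is an O(1) replacement with no gradient step, which is exactly the stability-plasticity trade-off the non-parametric design sidesteps.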

6. Impact, Applications, and Future Directions

MemoryOS systems are foundational for diverse application domains:

  • High-Performance/Portable Systems: By unifying DRAM, NVM, and storage-class memory, MemoryOS enables efficient database management, analytics, transaction processing, and HPC—achieving DRAM-like performance with superior energy and endurance profiles (Zhang et al., 2021, Wen et al., 2020).
  • Long-Term and Personalized AI Agents: Hierarchically managed memory with explicit updating policies enables AI agents to maintain continuity, adapt to evolving user states, and deliver consistent, personalized experiences across sessions (Kang et al., 30 May 2025).
  • Research Trajectory: Future MemoryOS research is geared towards NVMM-aware allocation policies, explicit tier-aware OS policies, AI agent memory governance, and dynamic adaptation to workload/interference patterns. The explicit separation of parametric and non-parametric memories, with structured lifecycle control, is likely to accelerate robust, scalable continual learning and cross-platform coordination in future intelligent systems (Garg et al., 2023, Li et al., 4 Jul 2025).

7. Summary and Significance

MemoryOS represents a paradigm shift from opaque, homogeneous memory management to explicit, programmable, and lifecycle-governed memory substrate design. From enabling efficient DRAM/NVM/flash hardware integration to supporting long-term, structured memory in AI agents, MemoryOS approaches bridge the gap between hardware capabilities, OS resource management, and evolving requirements of modern AI/LLM workloads. The integration of hierarchical storage, fine-grained tracking, provenance-managed memory units, and cross-modal updating lays the groundwork for the next generation of adaptive and efficient computing systems.