MIRIX: Multi-Agent Memory System for LLM-Based Agents (2507.07957v1)

Published 10 Jul 2025 in cs.CL and cs.AI

Abstract: Although memory capabilities of AI agents are gaining increasing attention, existing solutions remain fundamentally limited. Most rely on flat, narrowly scoped memory components, constraining their ability to personalize, abstract, and reliably recall user-specific information over time. To this end, we introduce MIRIX, a modular, multi-agent memory system that redefines the future of AI memory by solving the field's most critical challenge: enabling LLMs to truly remember. Unlike prior approaches, MIRIX transcends text to embrace rich visual and multimodal experiences, making memory genuinely useful in real-world scenarios. MIRIX consists of six distinct, carefully structured memory types: Core, Episodic, Semantic, Procedural, Resource Memory, and Knowledge Vault, coupled with a multi-agent framework that dynamically controls and coordinates updates and retrieval. This design enables agents to persist, reason over, and accurately retrieve diverse, long-term user data at scale. We validate MIRIX in two demanding settings. First, on ScreenshotVQA, a challenging multimodal benchmark comprising nearly 20,000 high-resolution computer screenshots per sequence, requiring deep contextual understanding and where no existing memory systems can be applied, MIRIX achieves 35% higher accuracy than the RAG baseline while reducing storage requirements by 99.9%. Second, on LOCOMO, a long-form conversation benchmark with single-modal textual input, MIRIX attains state-of-the-art performance of 85.4%, far surpassing existing baselines. These results show that MIRIX sets a new performance standard for memory-augmented LLM agents. To allow users to experience our memory system, we provide a packaged application powered by MIRIX. It monitors the screen in real time, builds a personalized memory base, and offers intuitive visualization and secure local storage to ensure privacy.

Summary

  • The paper introduces a modular multi-agent memory system with six dedicated components that enhance persistent, context-specific data recall.
  • It implements an active retrieval mechanism using embedding, BM25, and string matching. On ScreenshotVQA, MIRIX reports 35% higher accuracy than a RAG baseline with 99.9% less storage, and a 410% accuracy improvement over a long-context baseline.
  • The framework supports real-time, multimodal inputs and proposes a decentralized Agent Memory Marketplace for personalized, edge-integrated AI applications.

MIRIX: A Modular Multi-Agent Memory System for LLM-Based Agents

The MIRIX framework presents a comprehensive, modular memory architecture for LLM-based agents, addressing the persistent limitations of flat, text-centric, and narrowly scoped memory systems. The design is motivated by the need for agents to persist, abstract, and reliably recall user-specific information over extended periods, supporting both multimodal and large-scale real-world scenarios.

System Architecture

MIRIX introduces a memory system with six components. Each component is managed by a dedicated agent, and all of them are coordinated by a Meta Memory Manager. The components are:

  • Core Memory: Stores persistent, high-priority information about the user and agent persona.
  • Episodic Memory: Captures time-stamped, event-based user experiences.
  • Semantic Memory: Maintains abstract, factual, and relational knowledge.
  • Procedural Memory: Encodes step-by-step instructions and workflows.
  • Resource Memory: Handles documents, files, and multimodal resources.
  • Knowledge Vault: Secures verbatim and sensitive information.

This modularity enables specialized storage, retrieval, and update strategies, supporting efficient routing and compositional reasoning. The multi-agent design allows parallel updates and targeted retrieval, overcoming the bottlenecks of monolithic or flat memory systems.
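
The component layout above can be sketched as a small data structure. This is a minimal illustration, not the authors' implementation: the class names `MemoryType`, `MemoryAgent`, and `MetaMemoryManager`, the `route` method, and the plain-list storage are all assumptions made for the example. The summary does not say how routing decisions are made, so the caller supplies the target components here.

```python
from dataclasses import dataclass, field
from enum import Enum


class MemoryType(Enum):
    """The six memory components named in the summary."""
    CORE = "core"
    EPISODIC = "episodic"
    SEMANTIC = "semantic"
    PROCEDURAL = "procedural"
    RESOURCE = "resource"
    KNOWLEDGE_VAULT = "knowledge_vault"


@dataclass
class MemoryAgent:
    """Hypothetical per-component agent that owns a single store."""
    memory_type: MemoryType
    entries: list[str] = field(default_factory=list)

    def update(self, content: str) -> None:
        # A real agent could merge, rewrite, or reject content; this one appends.
        self.entries.append(content)


class MetaMemoryManager:
    """Hypothetical coordinator that forwards updates to the chosen agents."""

    def __init__(self) -> None:
        self.agents = {t: MemoryAgent(t) for t in MemoryType}

    def route(self, content: str, targets: list[MemoryType]) -> None:
        # Assumption: the routing decision is made elsewhere and passed in.
        for target in targets:
            self.agents[target].update(content)


manager = MetaMemoryManager()
manager.route("User attended a design review on Monday", [MemoryType.EPISODIC])
manager.route("User prefers dark mode", [MemoryType.CORE, MemoryType.SEMANTIC])
```

Keeping one agent per component means an update touches only the stores it concerns, which is the property the paragraph above attributes to the modular design.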

Active Retrieval Mechanism

A key innovation is the Active Retrieval pipeline. Upon receiving a user query, the system:

  1. Infers the current topic from the input context.
  2. Retrieves relevant entries from each memory component using embedding, BM25, or string matching.
  3. Injects retrieved content, tagged by source, into the system prompt for the LLM.

This mechanism ensures that the agent consistently grounds its responses in up-to-date, user-specific, and contextually relevant information, mitigating the risk of outdated or hallucinated outputs from the LLM's parametric memory.
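
The three steps above can be sketched as follows. This is an illustration under stated assumptions rather than the MIRIX code: `infer_topic` is a placeholder that returns the query unchanged (the summary does not describe how the topic is inferred), retrieval uses only a textbook Okapi BM25 scorer (embedding and string matching are omitted), and the function names and the tag format in the prompt are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def infer_topic(query: str) -> str:
    # Placeholder (assumption): a real system would derive a topic from context.
    return query


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    if n == 0:
        return []
    avgdl = sum(len(t) for t in tokenized) / n or 1.0
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def retrieve(topic: str, memories: dict[str, list[str]], top_k: int = 2) -> dict[str, list[str]]:
    """Return up to top_k positively scored entries from each memory component."""
    results = {}
    for name, entries in memories.items():
        ranked = sorted(zip(bm25_scores(topic, entries), entries),
                        key=lambda pair: pair[0], reverse=True)
        hits = [entry for score, entry in ranked if score > 0][:top_k]
        if hits:
            results[name] = hits
    return results


def build_system_prompt(base_prompt: str, retrieved: dict[str, list[str]]) -> str:
    """Append retrieved entries, each group wrapped in a tag naming its source."""
    sections = []
    for name, entries in retrieved.items():
        body = "\n".join(f"- {e}" for e in entries)
        sections.append(f"<{name}>\n{body}\n</{name}>")
    return base_prompt + "\n\n" + "\n".join(sections)


memories = {
    "episodic_memory": ["Met Alice for coffee on Tuesday", "Fixed the login bug"],
    "semantic_memory": ["Alice is the user's manager"],
    "procedural_memory": ["How to deploy: run the release script"],
}
retrieved = retrieve(infer_topic("Alice"), memories)
prompt = build_system_prompt("You are a helpful assistant.", retrieved)
```

Tagging each group by its source component lets the LLM tell, for instance, an event it is recalling from a standing fact about the user.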

Multimodal and Real-Time Capabilities

MIRIX is designed to handle rich multimodal input, including high-resolution images and documents. The system supports real-time screen monitoring, deduplication, and streaming uploads, leveraging cloud APIs (e.g., Gemini) for efficient image processing. This enables the construction of a persistent, personalized memory base from continuous user activity, with privacy-preserving local storage options.
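
As a rough illustration of the deduplication step, the sketch below drops a captured frame when its bytes are identical to the last frame that was kept. This is an assumption made for the example: the summary does not say how MIRIX detects duplicates, and a production system would more plausibly compare visual similarity than exact hashes. The function name `deduplicate_frames` is hypothetical.

```python
import hashlib
from typing import Iterable, Iterator


def deduplicate_frames(frames: Iterable[bytes]) -> Iterator[bytes]:
    """Yield only frames whose bytes differ from the previously kept frame."""
    last_digest = None
    for frame in frames:
        digest = hashlib.sha256(frame).hexdigest()
        if digest != last_digest:
            last_digest = digest
            yield frame


# Two identical consecutive captures collapse into one; later changes are kept.
kept = list(deduplicate_frames([b"screen-a", b"screen-a", b"screen-b", b"screen-a"]))
```

Because the function is a generator, it can sit in front of a streaming upload and filter frames as they arrive instead of buffering the whole session.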

Empirical Evaluation

MIRIX is evaluated on two demanding benchmarks:

  • ScreenshotVQA: A multimodal benchmark with nearly 20,000 high-resolution screenshots per user sequence. MIRIX achieves 35% higher accuracy than the RAG baseline while reducing storage requirements by 99.9%. Compared to a long-context Gemini baseline, MIRIX yields a 410% improvement in accuracy with a 93.3% reduction in storage.
  • LOCOMO: A long-form conversational benchmark. MIRIX attains state-of-the-art performance (85.4% accuracy), surpassing the best existing method by 8.0% and closely approaching the full-context upper bound.

These results demonstrate that MIRIX's structured, multi-agent memory architecture enables both superior retrieval accuracy and substantial efficiency gains over prior systems.
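
To make the percentages easier to interpret, the sketch below shows how a relative improvement and a relative reduction are computed. It assumes the reported figures are relative changes against the baseline, which the summary does not state explicitly. The input values are invented round numbers chosen only to reproduce the percentages; they are not the paper's measurements, and the function names are hypothetical.

```python
def relative_improvement_pct(baseline: float, new: float) -> float:
    """Percentage by which `new` exceeds `baseline`."""
    return 100.0 * (new - baseline) / baseline


def reduction_pct(baseline: float, new: float) -> float:
    """Percentage by which `new` is smaller than `baseline`."""
    return 100.0 * (baseline - new) / baseline


# Illustrative inputs only (not from the paper):
# a score of 51 against a baseline of 10 is a 410% relative improvement,
# and storing 1 unit instead of 1000 is a 99.9% reduction.
improvement = relative_improvement_pct(10, 51)   # 410.0
saving = reduction_pct(1000, 1)                  # 99.9
```

Read this way, a 410% improvement means the new score is 5.1 times the baseline, and a 99.9% storage reduction means the memory base takes one thousandth of the baseline's space.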

Practical Implications

MIRIX is implemented as a cross-platform application (React-Electron frontend, Uvicorn backend), supporting real-time screen monitoring, memory visualization, and secure local storage. The architecture is well-suited for integration into wearable devices, enabling persistent, context-aware memory for AI-powered glasses and pins. The modular design supports hybrid on-device/cloud memory management, addressing the compute and storage constraints of edge devices.

The authors also propose an Agent Memory Marketplace, envisioning personal memory as a digital asset class. This includes privacy-preserving infrastructure, decentralized storage, and peer-to-peer memory sharing, with applications in productivity, expert communities, and digital persona markets.

Theoretical and Future Directions

MIRIX advances the field by operationalizing cognitive science distinctions (episodic, semantic, procedural memory) within a scalable, agentic framework. The system's ability to abstract, consolidate, and reason over heterogeneous, long-term user data sets a new standard for memory-augmented LLM agents.

Future research directions include:

  • Developing more challenging, real-world multimodal benchmarks.
  • Enhancing retrieval strategies and memory compression techniques.
  • Extending the system to support collaborative, multi-user memory and collective intelligence.
  • Investigating privacy, security, and ethical considerations in memory sharing and marketplace scenarios.

Conclusion

MIRIX demonstrates that a modular, multi-agent memory system with active retrieval and multimodal support can substantially improve the long-term reasoning, personalization, and efficiency of LLM-based agents. The strong empirical results and practical deployment strategies highlight the viability of structured memory architectures as a foundation for next-generation AI assistants and cognitive systems.

Authors (2)
