- The paper demonstrates a modular multi-agent memory system that customizes and persists user data across time and modalities to enhance LLM agent performance.
- It introduces an active retrieval mechanism in which the agent generates a topic from the input and chooses among several retrieval methods (embedding match, BM25, string match) to ground responses in stored memory.
- The experimental evaluation reports 35% higher accuracy than a RAG baseline on ScreenshotVQA and an 8.0% improvement over the strongest open-source competitor on the LOCOMO long-form conversation benchmark.
MIRIX: A Modular Multi-Agent Memory System for LLM-Based Agents
The MIRIX framework introduces a comprehensive, modular memory architecture for LLM-based agents, addressing persistent limitations in current memory-augmented systems. The design is motivated by the need for agents to persist, abstract, and reliably recall user-specific information across time and modalities, a capability that is essential for consistent personalization, long-term reasoning, and real-world usability.
System Architecture
MIRIX is structured around six specialized memory components, each managed by a dedicated agent and coordinated by a central Meta Memory Manager:
- Core Memory: Stores persistent, high-priority information about the user and agent persona.
- Episodic Memory: Captures time-stamped, event-based user experiences.
- Semantic Memory: Maintains abstract, factual, and relational knowledge.
- Procedural Memory: Encodes step-by-step instructions and workflows.
- Resource Memory: Handles documents, files, and multimodal resources.
- Knowledge Vault: Secures verbatim, sensitive information (e.g., credentials, contacts).
This compositional approach enables fine-grained routing and retrieval, supporting both efficient storage and accurate, context-aware recall. Each memory type is internally structured (e.g., episodic entries include event type, summary, details, actor, timestamp), facilitating targeted updates and retrievals.
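As a rough illustration of the structured entries described above, the following sketch models an episodic entry with the five listed fields. The container class and its method names (`EpisodicMemory`, `between`, `update_summary`) are hypothetical and are not taken from the MIRIX codebase.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class EpisodicEntry:
    """One time-stamped event; the fields follow the list in the text."""
    event_type: str
    summary: str
    details: str
    actor: str
    timestamp: datetime


@dataclass
class EpisodicMemory:
    """Hypothetical in-memory container, not MIRIX's actual storage layer."""
    entries: list[EpisodicEntry] = field(default_factory=list)

    def add(self, entry: EpisodicEntry) -> None:
        self.entries.append(entry)

    def between(self, start: datetime, end: datetime) -> list[EpisodicEntry]:
        """Targeted retrieval: entries with start <= timestamp < end, oldest first."""
        hits = [e for e in self.entries if start <= e.timestamp < end]
        return sorted(hits, key=lambda e: e.timestamp)

    def update_summary(self, index: int, summary: str) -> None:
        """Targeted update: change one field without rewriting the entry."""
        self.entries[index].summary = summary


memory = EpisodicMemory()
memory.add(
    EpisodicEntry(
        event_type="user_message",
        summary="Asked about flights",
        details="User asked for flights to Oslo in May.",
        actor="user",
        timestamp=datetime(2025, 5, 1, 9, 0, tzinfo=timezone.utc),
    )
)
```

Keeping the timestamp and actor as separate fields, rather than embedding them in free text, is what makes time-range queries and single-field updates straightforward in this sketch.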
A multi-agent workflow underpins the system: the Meta Memory Manager orchestrates input processing, memory updates, and retrievals, delegating to specialized Memory Managers. This modularity supports parallelism, scalability, and extensibility, and is particularly well-suited for heterogeneous, multimodal user interactions.
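The delegation pattern described above can be sketched as a coordinator that decides which component managers should receive an input. In MIRIX the routing decision is made by an LLM-based agent; the keyword rules below are only a stand-in so that the example runs without a model, and all class and function names are hypothetical.

```python
from typing import Callable

# The six memory components named in the text.
MEMORY_TYPES = (
    "core", "episodic", "semantic", "procedural", "resource", "knowledge_vault",
)


class MemoryManager:
    """Hypothetical per-component manager that simply appends items."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.items: list[str] = []

    def update(self, text: str) -> None:
        self.items.append(text)


def keyword_router(text: str) -> list[str]:
    """Stand-in for the LLM routing decision (assumption: keyword rules)."""
    lowered = text.lower()
    targets = []
    if "password" in lowered or "api key" in lowered:
        targets.append("knowledge_vault")
    if "step" in lowered or "how to" in lowered:
        targets.append("procedural")
    if "yesterday" in lowered or "today" in lowered:
        targets.append("episodic")
    return targets or ["semantic"]  # default bucket for general facts


class MetaMemoryManager:
    """Coordinator that delegates each input to the selected managers."""

    def __init__(self, router: Callable[[str], list[str]] = keyword_router) -> None:
        self.router = router
        self.managers = {name: MemoryManager(name) for name in MEMORY_TYPES}

    def process(self, text: str) -> list[str]:
        targets = self.router(text)
        for name in targets:
            self.managers[name].update(text)
        return targets
```

Because the router is passed in as a callable, the keyword rules could be swapped for a model call without changing the coordinator, which is one way to read the extensibility claim.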
Active Retrieval Mechanism
A key innovation is the Active Retrieval mechanism. Rather than relying on explicit user prompts to trigger memory access, the agent autonomously generates a topic from the input context, retrieves relevant entries from each memory component, and injects them into the system prompt. Responses are thereby grounded in up-to-date, personalized, and contextually relevant information, which reduces reliance on outdated or incorrect parametric knowledge.
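The three steps (topic generation, per-component retrieval, prompt injection) can be sketched as follows. The topic step is performed by the LLM in MIRIX; `naive_topic` below is a deliberately crude stand-in, and the tag-based prompt layout is an assumption made for illustration.

```python
from typing import Callable


def naive_topic(user_input: str) -> str:
    """Stand-in for the LLM topic step: keep the longest word (assumption)."""
    words = [w.strip(".,!?").lower() for w in user_input.split()]
    return max(words, key=len) if words else ""


def retrieve(store: dict[str, list[str]], topic: str, limit: int = 2) -> dict[str, list[str]]:
    """Query every memory component with the same topic."""
    results = {}
    for component, entries in store.items():
        hits = [e for e in entries if topic and topic in e.lower()]
        if hits:
            results[component] = hits[:limit]
    return results


def build_system_prompt(base: str, retrieved: dict[str, list[str]]) -> str:
    """Append retrieved entries, grouped by component, to the base prompt."""
    lines = [base]
    for component, entries in retrieved.items():
        lines.append(f"<{component}>")
        lines.extend(f"- {e}" for e in entries)
        lines.append(f"</{component}>")
    return "\n".join(lines)


def active_retrieval(
    user_input: str,
    store: dict[str, list[str]],
    base_prompt: str,
    topic_fn: Callable[[str], str] = naive_topic,
) -> str:
    """Run topic generation, retrieval, and injection without a user trigger."""
    topic = topic_fn(user_input)
    return build_system_prompt(base_prompt, retrieve(store, topic))
```

The point of the sketch is the control flow: retrieval runs on every turn, driven by a topic derived from the input, rather than waiting for the user to ask the agent to search its memory.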
Multiple retrieval strategies are supported (embedding match, BM25, string match), and the agent dynamically selects the most appropriate method based on context. This flexibility enhances retrieval accuracy and efficiency across diverse query types.
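To make two of the named strategies concrete, here is a minimal pure-Python sketch of Okapi BM25 scoring and exact string matching, with a simple dispatcher. The selection rule in `choose_method` (quoted queries use string match, everything else uses BM25) is an assumption for illustration; the summary does not specify how the agent chooses. Embedding match is omitted because it requires an embedding model.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Okapi BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs if n_docs else 0.0
    # Number of documents containing each term.
    doc_freq = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def string_match(query: str, docs: list[str]) -> list[float]:
    """1.0 for documents containing the exact query string, else 0.0."""
    return [1.0 if query.lower() in d.lower() else 0.0 for d in docs]


def choose_method(query: str) -> str:
    """Assumed heuristic: quoted queries want exact match, others BM25."""
    quoted = len(query) >= 2 and query[0] == '"' and query[-1] == '"'
    return "string_match" if quoted else "bm25"


def search(query: str, docs: list[str]) -> list[str]:
    """Rank documents with the chosen method and drop zero-score results."""
    if choose_method(query) == "string_match":
        scores = string_match(query.strip('"'), docs)
    else:
        scores = bm25_scores(query, docs)
    ranked = sorted(zip(scores, docs), key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked if score > 0]
```

Exact matching suits identifiers and verbatim strings, while BM25 tolerates multi-word queries whose terms are scattered through a document; that difference is why a per-query choice can help.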
Application and Use Cases
MIRIX is implemented as a cross-platform application (React-Electron frontend, Uvicorn backend), with real-time screen monitoring, memory visualization, and a chat interface. The system captures screenshots at 1.5-second intervals, deduplicates similar images, and streams unique screenshots for processing. Visual data is processed via the Gemini API, enabling low-latency, asynchronous uploads and retrievals.
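The deduplication step can be sketched as a filter that keeps a frame only when it differs sufficiently from the last frame kept. The summary does not state which similarity measure MIRIX uses; the mean absolute pixel difference and the threshold value below are assumptions, and frames are represented as flat sequences of grayscale values to keep the example dependency-free.

```python
from typing import Iterable, Iterator, Sequence

CAPTURE_INTERVAL_S = 1.5  # capture interval stated in the text


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    """Average per-pixel difference between two equally sized grayscale frames."""
    if not a or len(a) != len(b):
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def unique_frames(
    frames: Iterable[Sequence[int]], threshold: float = 5.0
) -> Iterator[Sequence[int]]:
    """Yield a frame only if it differs enough from the last frame yielded."""
    last = None
    for frame in frames:
        if last is None or mean_abs_diff(frame, last) > threshold:
            last = frame
            yield frame
```

Comparing against the last kept frame, rather than the immediately preceding capture, prevents a slow drift of small changes from being discarded indefinitely.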
The architecture is also positioned for integration with wearable devices, supporting hybrid on-device/cloud memory management. Critical information can be stored locally for privacy, while large-scale resources are offloaded to the cloud. This design aligns with the constraints and requirements of edge AI applications.
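One way to picture the hybrid split is a placement policy that keeps privacy-critical items on the device and offloads large items to the cloud. The summary does not describe the actual policy; the choice of local memory types, the size cutoff, and all names below are assumptions.

```python
from dataclasses import dataclass


@dataclass
class MemoryItem:
    memory_type: str
    size_bytes: int
    sensitive: bool = False


LOCAL_TYPES = {"core", "knowledge_vault"}  # assumed to be privacy-critical
CLOUD_SIZE_THRESHOLD = 1_000_000  # assumed 1 MB cutoff for offloading


def placement(item: MemoryItem) -> str:
    """Return 'device' or 'cloud'; privacy takes precedence over size."""
    if item.sensitive or item.memory_type in LOCAL_TYPES:
        return "device"
    if item.size_bytes >= CLOUD_SIZE_THRESHOLD:
        return "cloud"
    return "device"
```

Checking sensitivity before size means a large but sensitive item still stays local, which matches the stated priority of privacy for critical information.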
The authors further propose an Agent Memory Marketplace, envisioning personal memory as a digital asset class. The marketplace would enable privacy-preserving sharing, aggregation, and monetization of structured memories, with robust encryption and fine-grained access controls.
Experimental Evaluation
ScreenshotVQA Benchmark
MIRIX is evaluated on ScreenshotVQA, a challenging multimodal benchmark involving up to 20,000 high-resolution screenshots per user sequence. Existing memory systems are inapplicable due to scale and modality constraints. MIRIX achieves:
- 35% higher accuracy than the RAG baseline
- 99.9% reduction in storage requirements compared to RAG
- 410% improvement in accuracy and 93.3% reduction in storage relative to a long-context Gemini baseline
These results are achieved by extracting and storing only salient information in a compact SQLite database, rather than retaining raw images.
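The storage idea can be sketched with Python's built-in `sqlite3` module: each screenshot is reduced to a short text record, and only that record is written. The table name, columns, and helper functions are hypothetical and do not reflect MIRIX's actual schema.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store; the schema here is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS memory_items (
               id INTEGER PRIMARY KEY,
               memory_type TEXT NOT NULL,
               captured_at TEXT NOT NULL,
               summary TEXT NOT NULL
           )"""
    )
    return conn


def store_summary(
    conn: sqlite3.Connection, memory_type: str, captured_at: str, summary: str
) -> int:
    """Persist extracted text only; the raw image is never written."""
    cur = conn.execute(
        "INSERT INTO memory_items (memory_type, captured_at, summary) VALUES (?, ?, ?)",
        (memory_type, captured_at, summary),
    )
    conn.commit()
    return cur.lastrowid


def find(conn: sqlite3.Connection, keyword: str) -> list[tuple[str, str]]:
    """Return (memory_type, summary) rows whose summary contains the keyword."""
    rows = conn.execute(
        "SELECT memory_type, summary FROM memory_items "
        "WHERE summary LIKE ? ORDER BY captured_at",
        (f"%{keyword}%",),
    )
    return rows.fetchall()
```

A text record of a few hundred bytes in place of a high-resolution image is the kind of substitution that makes storage reductions of the reported magnitude plausible.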
LOCOMO Benchmark
On the LOCOMO long-form conversation benchmark, MIRIX attains:
- State-of-the-art overall accuracy of 85.4%
- 8.0% improvement over the strongest open-source competitor (LangMem)
- Performance approaching the upper bound set by full-context models
MIRIX demonstrates particularly strong gains on multi-hop and temporal reasoning tasks, attributed to its hierarchical memory storage and explicit consolidation of dispersed information. The system's modular routing and retrieval are especially effective for long-range, multi-hop queries.
Implications and Future Directions
MIRIX establishes a new standard for memory-augmented LLM agents, both in terms of performance and system design. The modular, multi-agent architecture enables scalable, efficient, and contextually rich memory management, supporting real-world applications that require persistent, multimodal, and personalized memory.
The strong empirical results, especially the 99.9% storage reduction relative to RAG (roughly a thousandfold) and the accompanying accuracy gains, underscore the practical viability of structured, compositional memory systems over flat or text-centric approaches. The integration of active retrieval and multi-agent coordination further enhances the system's robustness and adaptability.
The proposed Agent Memory Marketplace introduces a novel paradigm for memory as a digital asset, with significant implications for privacy, personalization, and economic value in AI ecosystems.
Future research directions include:
- Extending MIRIX to more challenging, real-world benchmarks and diverse modalities
- Enhancing retrieval strategies with more sophisticated, context-aware methods
- Exploring lifelong learning, memory consolidation, and forgetting mechanisms
- Investigating the societal and ethical implications of memory sharing and monetization
MIRIX provides a robust foundation for the development of next-generation LLM agents with human-like memory capabilities, supporting persistent, adaptive, and contextually intelligent behavior in complex environments.