- The paper introduces a modular multi-agent memory system with six dedicated components that enhance persistent, context-specific data recall.
- It implements an active retrieval mechanism that combines embedding, BM25, and string matching, and reports accuracy improvements of up to 410% and storage savings of up to 99.9% over baseline systems.
- The framework supports real-time, multimodal inputs and proposes a decentralized Agent Memory Marketplace for personalized, edge-integrated AI applications.
MIRIX: A Modular Multi-Agent Memory System for LLM-Based Agents
The MIRIX framework presents a comprehensive, modular memory architecture for LLM-based agents, addressing the persistent limitations of flat, text-centric, and narrowly scoped memory systems. The design is motivated by the need for agents to persist, abstract, and reliably recall user-specific information over extended periods, supporting both multimodal and large-scale real-world scenarios.
System Architecture
MIRIX introduces a memory system with six components, each managed by a dedicated agent, with all six coordinated by a Meta Memory Manager. The components are:
- Core Memory: Stores persistent, high-priority information about the user and agent persona.
- Episodic Memory: Captures time-stamped, event-based user experiences.
- Semantic Memory: Maintains abstract, factual, and relational knowledge.
- Procedural Memory: Encodes step-by-step instructions and workflows.
- Resource Memory: Handles documents, files, and multimodal resources.
- Knowledge Vault: Secures verbatim and sensitive information.
This modularity enables specialized storage, retrieval, and update strategies, supporting efficient routing and compositional reasoning. The multi-agent design allows parallel updates and targeted retrieval, overcoming the bottlenecks of monolithic or flat memory systems.
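The component layout above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the class names `MemoryAgent` and `MetaMemoryManager`, the `route` method, and the list-based storage are all hypothetical, and the sketch leaves out how routing targets are chosen (the caller passes them explicitly).

```python
from dataclasses import dataclass, field
from enum import Enum


class MemoryType(Enum):
    """The six memory components named in the text."""
    CORE = "core"
    EPISODIC = "episodic"
    SEMANTIC = "semantic"
    PROCEDURAL = "procedural"
    RESOURCE = "resource"
    KNOWLEDGE_VAULT = "knowledge_vault"


@dataclass
class MemoryAgent:
    """Hypothetical per-component agent that owns one store."""
    memory_type: MemoryType
    entries: list[str] = field(default_factory=list)

    def update(self, content: str) -> None:
        # A real agent could summarize, merge, or reject content here.
        self.entries.append(content)


class MetaMemoryManager:
    """Hypothetical coordinator that forwards updates to component agents."""

    def __init__(self) -> None:
        self.agents = {t: MemoryAgent(t) for t in MemoryType}

    def route(self, content: str, targets: list[MemoryType]) -> None:
        # Assumption: the routing decision is made elsewhere and passed in.
        for target in targets:
            self.agents[target].update(content)


manager = MetaMemoryManager()
manager.route(
    "User met Alice at the product review on Monday",
    [MemoryType.EPISODIC, MemoryType.SEMANTIC],
)
```

Because each agent owns its own store, one input can update several components without the components knowing about each other, which is the property the text attributes to the modular design.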
Active Retrieval Mechanism
A key innovation is the Active Retrieval pipeline. Upon receiving a user query, the system:
- Infers the current topic from the input context.
- Retrieves relevant entries from each memory component using embedding similarity, BM25, or string matching.
- Injects retrieved content, tagged by source, into the system prompt for the LLM.
This mechanism ensures that the agent consistently grounds its responses in up-to-date, user-specific, and contextually relevant information, mitigating the risk of outdated or hallucinated outputs from the LLM's parametric memory.
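The retrieve-then-inject steps can be sketched as follows. This is a simplified illustration, not the system's actual code: topic inference is assumed to have already produced a topic string, embedding retrieval is omitted, the BM25 scorer is a small self-contained Okapi BM25 with common default parameters, and the function names and the angle-bracket source tags are hypothetical choices.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    if n == 0:
        return []
    avgdl = sum(len(d) for d in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def string_match(query: str, docs: list[str]) -> list[str]:
    """Case-insensitive substring match."""
    return [d for d in docs if query.lower() in d.lower()]


def retrieve(topic: str, memory: dict[str, list[str]],
             method: str = "bm25", top_k: int = 2) -> dict[str, list[str]]:
    """Query every memory component and keep the hits grouped by source."""
    results = {}
    for source, docs in memory.items():
        if method == "string_match":
            hits = string_match(topic, docs)[:top_k]
        else:
            ranked = sorted(zip(bm25_scores(topic, docs), docs),
                            key=lambda pair: pair[0], reverse=True)
            hits = [doc for score, doc in ranked[:top_k] if score > 0]
        if hits:
            results[source] = hits
    return results


def build_system_prompt(base: str, retrieved: dict[str, list[str]]) -> str:
    """Append retrieved entries to the system prompt, tagged by source."""
    blocks = []
    for source, hits in retrieved.items():
        body = "\n".join(f"- {h}" for h in hits)
        blocks.append(f"<{source}>\n{body}\n</{source}>")
    return base + "\n\n" + "\n".join(blocks)


memory = {
    "episodic_memory": ["2024-05-01: user booked a flight to Tokyo",
                        "2024-05-03: user ordered a new keyboard"],
    "semantic_memory": ["The user prefers window seats on a flight"],
    "procedural_memory": ["Steps to reset the router"],
}
retrieved = retrieve("flight", memory)
prompt = build_system_prompt("You are a helpful assistant.", retrieved)
```

Tagging each block with its source lets the LLM distinguish, for example, a dated event from a general preference when composing its answer.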
Multimodal and Real-Time Capabilities
MIRIX is designed to handle rich multimodal input, including high-resolution images and documents. The system supports real-time screen monitoring, deduplication, and streaming uploads, leveraging cloud APIs (e.g., Gemini) for efficient image processing. This enables the construction of a persistent, personalized memory base from continuous user activity, with privacy-preserving local storage options.
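The text does not say how deduplication is performed. As a rough sketch of the idea only, the following drops a captured frame when it is byte-identical to the last frame that was kept. Exact hashing is a stand-in chosen for simplicity; a production system would more plausibly use a visual similarity measure, and the function name is hypothetical.

```python
import hashlib
from typing import Iterable, Iterator


def deduplicate_frames(frames: Iterable[bytes]) -> Iterator[bytes]:
    """Yield only frames whose content differs from the previously kept frame."""
    last_digest = None
    for frame in frames:
        digest = hashlib.sha256(frame).hexdigest()
        if digest != last_digest:
            last_digest = digest
            yield frame


# Stand-in byte strings; real input would be encoded screenshots.
captured = [b"screen-a", b"screen-a", b"screen-b", b"screen-a"]
kept = list(deduplicate_frames(captured))
```

Only consecutive repeats are dropped, so a screen that reappears later is still recorded as a new event.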
Empirical Evaluation
MIRIX is evaluated on two demanding benchmarks:
- ScreenshotVQA: A multimodal benchmark with up to 20,000 high-resolution screenshots per user sequence. MIRIX achieves 35% higher accuracy than the RAG baseline while reducing storage requirements by 99.9%. Compared to a long-context Gemini baseline, MIRIX yields a 410% improvement in accuracy with a 93.3% reduction in storage.
- LOCOMO: A long-form conversational benchmark. MIRIX attains state-of-the-art performance (85.4% accuracy), surpassing the best existing method by 8.0% and closely approaching the full-context upper bound.
These results demonstrate that MIRIX's structured, multi-agent memory architecture enables both superior retrieval accuracy and substantial efficiency gains over prior systems.
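To make the scale of these figures concrete, and assuming they are relative to each baseline (the summary does not state absolute values), the percentages translate into multipliers as follows, where $a$ denotes accuracy and $s$ denotes storage:

```latex
\begin{align*}
g = \frac{a_{\text{MIRIX}} - a_{\text{base}}}{a_{\text{base}}}
  \quad &\Longrightarrow \quad a_{\text{MIRIX}} = (1 + g)\,a_{\text{base}} \\
r = 1 - \frac{s_{\text{MIRIX}}}{s_{\text{base}}}
  \quad &\Longrightarrow \quad s_{\text{MIRIX}} = (1 - r)\,s_{\text{base}}
\end{align*}
```

Under that reading, $g = 0.35$ corresponds to 1.35 times the RAG baseline's accuracy and $g = 4.10$ to 5.1 times the long-context baseline's accuracy. Likewise, $r = 0.999$ means storing one thousandth of the RAG baseline's data, and $r = 0.933$ means storing about 6.7% of the long-context baseline's data.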
Practical Implications
MIRIX is implemented as a cross-platform application (React-Electron frontend, Uvicorn backend), supporting real-time screen monitoring, memory visualization, and secure local storage. The architecture is well-suited for integration into wearable devices, enabling persistent, context-aware memory for AI-powered glasses and pins. The modular design supports hybrid on-device/cloud memory management, addressing the compute and storage constraints of edge devices.
The authors also propose an Agent Memory Marketplace, envisioning personal memory as a digital asset class. This includes privacy-preserving infrastructure, decentralized storage, and peer-to-peer memory sharing, with applications in productivity, expert communities, and digital persona markets.
Theoretical and Future Directions
MIRIX advances the field by operationalizing cognitive science distinctions (episodic, semantic, procedural memory) within a scalable, agentic framework. The system's ability to abstract, consolidate, and reason over heterogeneous, long-term user data sets a new standard for memory-augmented LLM agents.
Future research directions include:
- Developing more challenging, real-world multimodal benchmarks.
- Enhancing retrieval strategies and memory compression techniques.
- Extending the system to support collaborative, multi-user memory and collective intelligence.
- Investigating privacy, security, and ethical considerations in memory sharing and marketplace scenarios.
Conclusion
MIRIX demonstrates that a modular, multi-agent memory system with active retrieval and multimodal support can substantially improve the long-term reasoning, personalization, and efficiency of LLM-based agents. The strong empirical results and practical deployment strategies highlight the viability of structured memory architectures as a foundation for next-generation AI assistants and cognitive systems.