MemEngine: Unified Memory Model Library
- MemEngine is a unified, extensible library that standardizes the integration of diverse memory models for autonomous LLM agents, addressing fragmentation across designs.
- It employs a three-tier modular architecture—MemoryFunctions, MemoryOperations, and MemoryModels—to enable systematic benchmarking, dynamic configuration, and seamless plugin integration.
- The framework supports various memory paradigms, including concatenation, embedding-based retrieval, and graph-based self-reflection, using a uniform CRUD-style API.
MemEngine is a unified, extensible library designed to facilitate the development and integration of advanced memory models for LLM-based agents. Launched to address fragmentation in memory design for autonomous agents, MemEngine offers a modular framework where components for encoding, retrieval, summarization, reflection, and memory management are rigorously separated and combinable. It implements all major published agent memory paradigms—ranging from naive history concatenation to graph-based, self-reflective, and optimization-driven schemas—under a single API with a hierarchical configuration system and robust support for plugin development and remote deployment (Zhang et al., 4 May 2025).
1. Modular Architecture and Design Principles
MemEngine enforces a three-tiered software hierarchy for memory management in agent frameworks:
- Level 1: Memory Functions. These are atomic computation units (e.g., text encoders, retrievers, summarizers, reflectors, judges, forgetters, LLM query wrappers). Each MemoryFunction receives raw input and produces a specific output, such as an embedding vector, an importance score, or summarized text.
- Level 2: Memory Operations. MemoryOperations chain multiple MemoryFunctions into compound actions, including Store, Recall, Manage, and Optimize. For example, a Recall operation might encode a query, score its similarity against stored items, and apply a filtering function.
- Level 3: Memory Models. MemoryModels compose MemoryOperations ("recipes") to implement published paradigms such as hierarchical retrieval, dynamic summarization, self-reflection, or tree-based indexing. A minimal sketch of this layering appears below.
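To make the layering concrete, here is a compact, hypothetical sketch of how the three tiers compose. The class names and signatures below are illustrative only and are not MemEngine's actual base classes:

```python
# Illustrative sketch of the three-tier layering (hypothetical classes,
# not MemEngine's real base-class signatures).
from typing import Callable, List

def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

class EncoderFunction:
    """Level 1: an atomic function mapping text to an embedding vector."""
    def __init__(self, embed: Callable[[str], List[float]]):
        self.embed = embed
    def __call__(self, text: str) -> List[float]:
        return self.embed(text)

class RecallOperation:
    """Level 2: an operation chaining encode -> score -> filter."""
    def __init__(self, encoder: EncoderFunction, top_k: int = 3):
        self.encoder, self.top_k = encoder, top_k
    def __call__(self, query: str, items: List[str]) -> List[str]:
        q = self.encoder(query)
        ranked = sorted(items, key=lambda m: dot(q, self.encoder(m)), reverse=True)
        return ranked[: self.top_k]

class SimpleMemoryModel:
    """Level 3: a model composing operations behind a CRUD-style surface."""
    def __init__(self, recall_op: RecallOperation):
        self.items: List[str] = []
        self.recall_op = recall_op
    def store(self, item: str) -> None:
        self.items.append(item)
    def recall(self, query: str) -> List[str]:
        return self.recall_op(query, self.items)
```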
Configuration is externally specified at all levels via YAML/JSON files or dictionaries, decoupling hyperparameters and prompt templates from code, and supporting reproducibility and easy tuning.
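For illustration, a configuration file in this style might look like the following; the keys shown are hypothetical, so consult the library documentation for the actual schema:

```yaml
# Hypothetical configuration sketch (keys are illustrative, not the real schema)
hyperparameters:
  top_k: 5
  embedding_model: text-embedding-3-small
prompts:
  summarizer: "Condense the following dialogue into key facts: {text}"
storage:
  backend: faiss
  dim: 1536
```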
The class hierarchy is summarized below:
| Abstract Base | Canonical Subclass Examples |
|---|---|
| MemoryFunctionBase | Encoder, Retrieval, Summarizer, Judge |
| MemoryOperationBase | StoreOp, RecallOp, ManageOp, OptimizeOp |
| MemoryModelBase | FUMemory, LTMemory, MBMemory, GAMemory |
Utilities such as FastAPI, storage clients (FAISS, Redis, Weaviate), display helpers, and an automatic selector for sweeping hyperparameters are loosely coupled and optional.
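As one illustration of a loosely coupled vector backend, the following sketch wires a FAISS inner-product index behind a stub embedder. The embedder and glue code are assumptions for demonstration, not MemEngine's storage-client API:

```python
import numpy as np
import faiss  # pip install faiss-cpu

DIM = 384  # embedding dimensionality of the stub encoder below

def embed(text: str) -> np.ndarray:
    # Stub embedder for illustration; a real deployment would call a
    # sentence-transformer or an LLM embedding endpoint here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(DIM).astype("float32")
    return v / np.linalg.norm(v)

index = faiss.IndexFlatIP(DIM)  # inner product == cosine on unit vectors
texts = ["User: What's the capital of France?", "Assistant: It's Paris."]
index.add(np.stack([embed(t) for t in texts]))

scores, ids = index.search(embed("France capitals")[None, :], k=2)
print([texts[i] for i in ids[0]])
```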
2. Canonical Memory Models and Mathematical Operations
MemEngine re-implements nine memory models from recent literature, providing comprehensive coverage:
- FUMemory: Full history as flat context (naive concatenation).
- STMemory: Sliding window over the most recent exchanges.
- LTMemory: Top-$k$ embedding-based retrieval; stores all messages and selects those most similar to the current query (see the sketch after this list):

$$\mathrm{sim}(q, m_i) = \frac{e_q \cdot e_{m_i}}{\lVert e_q \rVert \, \lVert e_{m_i} \rVert}$$

- GAMemory: As in GenerativeAgents, combines importance-weighted retrieval (importance computed by the Judge function) with periodic reflection, scoring each memory as:

$$\mathrm{score}(m) = w_{\mathrm{rec}} \cdot \mathrm{recency}(m) + w_{\mathrm{imp}} \cdot \mathrm{importance}(m) + w_{\mathrm{rel}} \cdot \mathrm{relevance}(m, q)$$
- MBMemory: MemoryBank; applies periodic summarization and forgetting.
- SCMemory: Self-Controlled Memory; the agent itself controls when and what to recall, keeping recalled context minimal.
- MGMemory: MemGPT; hierarchical memory structures.
- RFMemory: Reflexion; learns keep/drop policies via LLM-based OptimizeOp.
- MTMemory: MemTree; tree-based semantic index.
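The two retrieval rules above can be sketched in a few lines. The weights, decay factor, and item fields below are placeholders rather than MemEngine internals:

```python
# Minimal sketch of LTMemory- and GAMemory-style scoring (illustrative only).
import math

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def lt_recall(query_vec, items, k=5):
    # LTMemory: top-k items by cosine similarity to the query embedding.
    return sorted(items, key=lambda it: cosine(query_vec, it["vec"]), reverse=True)[:k]

def ga_score(item, query_vec, now, w_rec=1.0, w_imp=1.0, w_rel=1.0, decay=0.995):
    # GAMemory: weighted sum of recency, importance (from a Judge), relevance.
    recency = decay ** (now - item["last_access"])
    return (w_rec * recency
            + w_imp * item["importance"]
            + w_rel * cosine(query_vec, item["vec"]))
```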
A key feature is the uniform CRUD-style (Create, Read, Update, Delete) API, allowing experiments to swap models without interface changes.
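Because every model exposes the same store/recall surface, a benchmark harness can swap paradigms without changing any calling code. A sketch reusing the documented MemoryConfig/store/recall calls; preset filenames other than default_long_term.yaml are hypothetical:

```python
from memengine.config import MemoryConfig
from memengine.models import FUMemory, STMemory, LTMemory

# Preset filenames besides default_long_term.yaml are assumed for illustration.
presets = [(FUMemory, "default_full.yaml"),
           (STMemory, "default_short_term.yaml"),
           (LTMemory, "default_long_term.yaml")]

for model_cls, preset in presets:
    memory = model_cls(config=MemoryConfig(preset))
    memory.reset()
    memory.store("User: What's the capital of France?")
    memory.store("Assistant: It's Paris, of course.")
    print(model_cls.__name__, "->", memory.recall(query="France"))
```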
3. API Usage and Integration
MemEngine presents standardized usage workflows in Python or via remote HTTP endpoints:
- Installation:

```bash
pip install memengine
```
- Local Example:

```python
from memengine.config import MemoryConfig
from memengine.models import LTMemory

cfg = MemoryConfig("default_long_term.yaml")
cfg.hyperparameters["top_k"] = 5

memory = LTMemory(config=cfg)
memory.reset()
memory.store("User: What's the capital of France?")
memory.store("Assistant: It's Paris, of course.")

context = memory.recall(query="I need to mention France capitals again")
print("Retrieved snippets:", context)
```
- Client Example (Remote):

```python
from memengine.client import MemClient

client = MemClient("http://memengine-server:8000")
client.init_memory(agent_id="agent42", model="MBMemory")
client.store("agent42", "User: Tell me a story")
response = client.recall("agent42", "story context")
```
Three usage modes are supported: default presets, modified configs, or automatic selector sweeps over models/hyperparameters.
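The selector's exact API is not shown in the source, so the following hand-rolled sweep is only a sketch of the same idea; evaluate_on_dev_set and the preset reuse are placeholders:

```python
from memengine.config import MemoryConfig
from memengine.models import STMemory, LTMemory, GAMemory

def evaluate_on_dev_set(memory) -> float:
    # Placeholder: replay a dev set of dialogues through `memory`
    # and return an application-specific recall-quality score.
    return 0.0

def run_trial(model_cls, top_k: int) -> float:
    # Reuses the preset from the local example; in practice each
    # model class would load its own default preset.
    cfg = MemoryConfig("default_long_term.yaml")
    cfg.hyperparameters["top_k"] = top_k
    memory = model_cls(config=cfg)
    memory.reset()
    return evaluate_on_dev_set(memory)

candidates = [(cls, k)
              for cls in (STMemory, LTMemory, GAMemory)
              for k in (3, 5, 10)]
best_cls, best_k = max(candidates, key=lambda c: run_trial(*c))
print("Best:", best_cls.__name__, "top_k =", best_k)
```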
4. Extensibility Mechanisms
MemEngine supports extensive customization:
- Plugins (Functions/Operations/Models):
MemoryFunctionBase, MemoryOperationBase, and MemoryModelBase are subclassable. Registration hooks allow dynamic addition of components.
Example of a new MemoryFunction:
```python
from memengine.functions import MemoryFunctionBase, FunctionRegistry

class BiasJudge(MemoryFunctionBase):
    def call(self, text: str) -> float:
        # some_custom_bias_score is user-supplied scoring logic.
        return some_custom_bias_score(text)

FunctionRegistry.register("BiasJudge", BiasJudge)
```
- Custom Operations and Models:
MemoryOperationBase and MemoryModelBase can be subclassed and registered, allowing arbitrary retrieval, management, and reflection strategies.
Example for a custom recall operation:
```python
from memengine.operations import MemoryOperationBase, OperationRegistry

class CustomRecallOp(MemoryOperationBase):
    def recall(self, query, store):
        candidates = store.get_all()
        # Sort descending so the three highest-scoring items are returned;
        # my_custom_score is user-supplied.
        return sorted(candidates,
                      key=lambda d: my_custom_score(query, d),
                      reverse=True)[:3]

OperationRegistry.register("CustomRecall", CustomRecallOp)
```
- Custom Storage:
Storage can be redirected to any backend supporting put/get/query.
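A minimal adapter honoring that put/get/query contract might look like this toy in-memory backend (the class shape is illustrative, not MemEngine's actual storage interface):

```python
class DictStorage:
    """Toy in-memory backend exposing the put/get/query contract."""
    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value

    def get(self, key):
        return self._items.get(key)

    def query(self, predicate, limit=10):
        # Linear scan for illustration; a real backend (Redis, Weaviate,
        # FAISS) would index vectors or metadata instead.
        hits = [v for v in self._items.values() if predicate(v)]
        return hits[:limit]
```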
This suggests highly granular control over agent behavior, memory management policies, and backend integration.
5. Comparison to Prior Agent Memory Libraries
MemEngine differs from prior frameworks (AutoGen, MetaGPT, CAMEL, Memary, Cognee, Mem0) by implementing every major research memory model and treating every subcomponent as pluggable. Other libraries typically support only a single retrieval or history model. Table 1 (from the source) illustrates comparative feature coverage:
| Feature | AutoGen/MetaGPT/CAMEL | Memary/Cognee/Mem0 | MemEngine |
|---|---|---|---|
| Plug-and-play Integration | ✓ | ✓ | ✓ |
| Basic Read/Write | ✓ | ✓ | ✓ |
| Reflection & Optimization | ✓ | ✓ | ✓ |
| Comprehensive Default Models | ✗ (1–2 presets) | ✗ | ✓ (9+) |
| Advanced Model Customization | ✗ | ✗ | ✓ |
A plausible implication is that MemEngine’s design supports more rigorous benchmarking and enables the invention of hybrid or novel memory schemes by mixing and matching established modules.
6. Deployment Strategies and Best Practices
Deployment guidance is specified for reproducible experimentation and resource optimization:
- Automatic selector mode is recommended for initial model selection (e.g., sweep STMemory, LTMemory, GAMemory against a dev set).
- Tuning is strictly performed via config files.
- Heavyweight components (embedding retrieval, summarization, reflection) should run against remote storage or vector clusters.
- Built-in utilities allow visualization of memory access patterns and debugging recall events.
- Compositional use of models is encouraged, e.g., STMemory for immediate context, LTMemory for global retrieval, and MBMemory for periodic summarization (see the sketch after this list).
- Token usage should be profiled; summarization/truncation should limit memory size and cost.
- Forgetting thresholds must be actively monitored when employing MBMemory-style summarization and compression.
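A composition along those lines might be wired as follows; the CompositeMemory wrapper and two of the preset filenames are hypothetical, while the model classes and CRUD surface come from the library:

```python
from memengine.config import MemoryConfig
from memengine.models import STMemory, LTMemory, MBMemory

class CompositeMemory:
    """Hypothetical wrapper fanning writes out to several models
    and concatenating their recalls."""
    def __init__(self, *models):
        self.models = models

    def store(self, item):
        for m in self.models:
            m.store(item)

    def recall(self, query):
        # Assumes each model's recall returns a list of snippets; if a
        # model returns a single string, wrap it before concatenating.
        results = []
        for m in self.models:
            results.extend(m.recall(query=query))
        return results

memory = CompositeMemory(
    STMemory(config=MemoryConfig("default_short_term.yaml")),   # assumed preset
    LTMemory(config=MemoryConfig("default_long_term.yaml")),
    MBMemory(config=MemoryConfig("default_memorybank.yaml")),   # assumed preset
)
```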
Available documentation and codebases are maintained at https://github.com/nuster1128/MemEngine and https://memengine.readthedocs.io.
7. Significance and Research Utility
By offering a unified interface, rigorous modularization, and broad coverage of canonical memory models, MemEngine enables systematic benchmarking, rapid prototyping, and the creation of custom agent memory architectures without redundant re-implementation. Its extensibility supports direct translation of conceptual advances in agent memory into production or experimental pipelines, facilitating progress in robust, context-aware LLM agent research while supporting transparent ablation and comparison of alternative memory paradigms (Zhang et al., 4 May 2025).