MemEngine Library
- MemEngine is a unified and modular software library that facilitates the development, composition, and deployment of advanced memory models in LLM-based agents.
- It features a hierarchical architecture with three core layers—Memory Functions, Memory Operations, and Memory Models—enabling consistent configuration and rapid prototyping.
- The library supports rigorous benchmarking and extensibility, allowing researchers to customize models and validate performance metrics such as Recall@5, BLEU, latency, and memory footprint.
MemEngine is a unified and modular software library designed for development, composition, and deployment of advanced memory models in LLM-based agents. Its framework supplies a hierarchical architecture with extensible components, enabling researchers and practitioners to implement, benchmark, and customize multiple recent memory schemes using a consistent API. MemEngine emphasizes modularity, pluggability, and completeness, supporting rapid prototyping and scalable deployment across research and production contexts (Zhang et al., 4 May 2025).
1. Architectural Structure
MemEngine’s architecture organizes all logic and components hierarchically into three core layers, with auxiliary modules for configuration and utilities. Every element is designed to be modular and interchangeable.
- Layer 1: Memory Functions
This layer exposes atomic capabilities including Encoder, Retrieval, Summarizer, Judge, Reflector, Trigger, Forget, Truncation, Utilization, and an LLM wrapper. Each function operates at a granular level, e.g., Encoder.embed(text) → ℝᵈ.
- Layer 2: Memory Operations
Operations orchestrate combinations of Memory Functions into behaviors such as Store, Recall, Manage, and Optimize. For example, LTMemoryRecall composes Encoder + Retrieval, whereas GAMemoryStore utilizes Judge + Summarizer.
- Layer 3: Memory Models
Models expose a uniform interface of five standard methods: reset(), store(observation: str), recall(query: str) → str or List[str], manage(), and optimize().
- Configuration Module: The hierarchical config system (YAML/JSON/Python dict) enables override of defaults at any granularity and validates structure via schema mechanisms.
- Utility Module: Supports storage backends (in-memory, SQLite, Redis), HTML/CLI visualization, a FastAPI remote client/server, and an automatic model selector for new tasks.
High-level schema:
```
+-----------------------------+
|        Memory Models        |
|   FUMemory, LTMemory, ...   |
+---------------+-------------+
                |
+---------------v-------------+
|      Memory Operations      |
| Store, Recall, Manage, ...  |
+---------------+-------------+
                |
+---------------v-------------+
|      Memory Functions       |
|   Encoder, Retrieval, ...   |
+-----------------------------+
```
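The five-method model interface at the top layer can be illustrated with a self-contained toy. This is plain Python, not MemEngine's actual classes; the list buffer and keyword matching are illustrative assumptions standing in for the library's internals:

```python
from typing import List

class ToyFullMemory:
    """Illustrative stand-in for a MemEngine-style memory model:
    a plain buffer exposing the five standard methods."""

    def __init__(self) -> None:
        self._buffer: List[str] = []

    def reset(self) -> None:
        """Clear all stored observations."""
        self._buffer.clear()

    def store(self, observation: str) -> None:
        """Append a new observation to the buffer."""
        self._buffer.append(observation)

    def recall(self, query: str) -> List[str]:
        """Naive recall: return observations sharing a word with the query."""
        words = set(query.lower().split())
        return [o for o in self._buffer if words & set(o.lower().split())]

    def manage(self) -> None:
        """Housekeeping hook; here, simple order-preserving deduplication."""
        self._buffer = list(dict.fromkeys(self._buffer))

    def optimize(self) -> None:
        """Learning hook (e.g., reflection); a no-op in this sketch."""
        pass
```

Any object with these five methods can be swapped behind the same agent loop, which is the point of the uniform interface.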
2. Built-in Memory Model Implementations
MemEngine provides nine memory models, each combining functions and operations through composition. All models implement the five standard methods listed above, differing in algorithmic internals and operational wiring.
| Model | Key Mechanism |
|---|---|
| FUMemory | Full buffer |
| LTMemory | Embedding-based recall |
| STMemory | Recent utterances |
| GAMemory | Weighted/contextual scoring |
| MBMemory | Multi-layer bank with summaries |
| SCMemory | Minimal covering set |
| MGMemory | Hierarchical/OS-style: process, schedule, I/O primitives |
| RFMemory | Learn-to-memorize |
| MTMemory | Semantic tree with summarized nodes |
The design enables empirical comparison and ablation within a uniform agent context.
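The function-to-operation composition (e.g., an LTMemory-style recall built from Encoder + Retrieval) can be sketched with toy stand-ins. The bag-of-words encoder and cosine retrieval below are illustrative assumptions, not the library's actual Encoder or Retrieval functions:

```python
import math
from typing import Dict, List

def embed(text: str) -> Dict[str, float]:
    """Toy encoder: bag-of-words counts (stands in for Encoder.embed)."""
    vec: Dict[str, float] = {}
    for tok in text.lower().split():
        vec[tok] = vec.get(tok, 0.0) + 1.0
    return vec

def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recall(query: str, memory: List[str], k: int = 2) -> List[str]:
    """Encoder + Retrieval composed into a Recall-style operation:
    rank stored observations by similarity to the query, return top-k."""
    q = embed(query)
    ranked = sorted(memory, key=lambda m: cosine(q, embed(m)), reverse=True)
    return ranked[:k]
```

Swapping the encoder for a neural embedding model changes recall quality without touching the operation's wiring, which mirrors the interchangeability the layered design is meant to provide.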
3. Programming Interface and Usage Patterns
All essential classes are available via the memengine Python package. Agent workflows leverage the following standardized operations:
```python
from memengine import MemEngine, MemoryConfig
from memengine.models import LTMemory, MTMemory

cfg = MemoryConfig.load("ltmemory_default.yaml")
engine = MemEngine(config=cfg)
lt = LTMemory(name="longterm", config=cfg.models.LTMemory)
mt = MTMemory(name="treemem", config=cfg.models.MTMemory)
engine.register(lt)
engine.register(mt)

engine.store(observation_text)
response = some_llm(prompt + engine.recall(query_text))
engine.manage()
engine.optimize()
raw_items = engine.get_storage("longterm").all()
```
Custom model configuration (e.g., GAMemory threshold adjustment) and interactive loops are facilitated directly in Python, with hooks for summarization, reflection, and selective recall.
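The override-at-any-granularity behavior of the hierarchical config system can be mimicked with a recursive dictionary merge. This is a sketch of the pattern, not MemEngine code, and the GAMemory keys below are hypothetical example fields:

```python
from typing import Any, Dict

def merge_config(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively overlay user overrides on library defaults, so a user
    can change one nested value without restating the whole config."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)  # descend into nested dicts
        else:
            merged[key] = value  # leaf override wins
    return merged

# Hypothetical defaults with one value overridden at depth:
defaults = {"GAMemory": {"threshold": 0.5, "recency_weight": 1.0}, "encoder": "small"}
custom = merge_config(defaults, {"GAMemory": {"threshold": 0.8}})
```

After the merge, `custom` keeps `recency_weight` and `encoder` from the defaults while only `threshold` changes.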
4. Extensibility and Customization Mechanisms
MemEngine’s internals are designed for arbitrary extension at the function, operation, or model level.
- Base Abstractions:
`BaseMemoryFunction`, `BaseMemoryOperation`, `BaseMemoryModel`
- New Functions:
Subclass `BaseMemoryFunction`, implement `forward`, and register via decorator.
- New Operations:
Subclass `BaseMemoryOperation`, define `execute`, and compose functions as required.
- New Models:
Subclass `BaseMemoryModel`, implement the five interface methods, and aggregate operations via composition.
Additional extension points comprise pipeline hooks (e.g., on_before_store), Pydantic-based config validation, and custom FastAPI endpoint integration for distributed/remote use. This design fosters incorporation of novel mechanisms, functions, and workflows with minimal friction.
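The subclass-and-register pattern can be sketched in a few lines. The registry dictionary, decorator name, and `KeywordJudge` class below are illustrative assumptions, not MemEngine's actual machinery:

```python
from typing import Callable, Dict

FUNCTION_REGISTRY: Dict[str, type] = {}

def register_function(name: str) -> Callable[[type], type]:
    """Decorator that records a function class in a global registry,
    in the spirit of the register-via-decorator pattern described above."""
    def wrapper(cls: type) -> type:
        FUNCTION_REGISTRY[name] = cls
        return cls
    return wrapper

class BaseMemoryFunction:
    """Minimal stand-in for the library's base abstraction."""
    def forward(self, *args, **kwargs):
        raise NotImplementedError

@register_function("keyword_judge")
class KeywordJudge(BaseMemoryFunction):
    """Toy Judge: scores an observation by keyword hits."""
    def __init__(self, keywords):
        self.keywords = set(keywords)

    def forward(self, observation: str) -> int:
        return sum(1 for tok in observation.lower().split() if tok in self.keywords)
```

Once registered, the new function can be instantiated by name from configuration, which is what makes the composition layers pluggable.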
5. Evaluation and Benchmarking
MemEngine has been evaluated in representative agent-centric tasks:
- Synthetic long-context QA (10K tokens)
- Multi-round dialogue (20 turns)
- Role-playing simulation (50 steps, memory intensity)
Benchmarked metrics:
- Recall Accuracy @ K
- BLEU/ROUGE response quality
- Latency per recall (ms)
- Memory footprint (KB per 1000 tokens)
Selected model results (averaged over five runs):
| Model | Recall@5 | BLEU-2 | Latency (ms) | Footprint |
|---|---|---|---|---|
| FUMemory | 0.72 | 18.4 | 22 | 512 KB |
| LTMemory | 0.85 | 21.7 | 33 | 128 KB |
| MBMemory | 0.88 | 23.1 | 47 | 96 KB |
| MTMemory | 0.91 | 24.5 | 37 | 112 KB |
| RFMemory | 0.89 | 23.8 | 62 | 128 KB |
Tree-structured memory models (MTMemory) achieve the highest Recall@5, while Reflexion models (RFMemory) demonstrate superior learning-to-memorize capability at increased optimization cost.
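The Recall@K metric reported above follows its standard information-retrieval definition; a minimal reference implementation (not MemEngine-specific code):

```python
from typing import Sequence

def recall_at_k(retrieved: Sequence[str], relevant: Sequence[str], k: int) -> float:
    """Fraction of relevant items that appear among the top-k retrieved items."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return sum(1 for item in relevant if item in top_k) / len(relevant)
```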
6. Deployment, Best Practices, and Limitations
- Dependencies: Python ≥ 3.8; PyTorch/TensorFlow for LLM wrappers; HuggingFace Transformers > 4.30; FastAPI (optional); Redis/SQLite (backend).
- Deployment:
  - Local install: `pip install memengine`
  - Remote server: `memengine serve --port 8000`
- Practical Guidance:
Pre-warm encoders; batch memory operations; tune summarization thresholds; monitor database size; prune/summarize regularly; accelerate embeddings/judging with GPU.
- Common Pitfalls:
Unbounded FUMemory may cause token overflow—truncate buffers. Excessive forgetting can drop salient facts—validate retention on held-out data. Configuration mismatches require strict version locking across experiments.
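The buffer-truncation remedy for FUMemory overflow can be sketched as a token-budget filter that keeps only the most recent observations. Whitespace tokenization is a simplifying assumption; a production version would count model tokens:

```python
from typing import List

def truncate_buffer(buffer: List[str], max_tokens: int) -> List[str]:
    """Keep the most recent observations whose combined (whitespace) token
    count fits the budget -- a simple guard against buffer overflow."""
    kept: List[str] = []
    used = 0
    for obs in reversed(buffer):          # walk newest-first
        n = len(obs.split())
        if used + n > max_tokens:
            break                         # budget exhausted; drop older items
        kept.append(obs)
        used += n
    return list(reversed(kept))           # restore chronological order
```

Summarizing the dropped prefix instead of discarding it (as the summarizer-based models do) trades latency for retention of salient facts.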
A plausible implication is that MemEngine’s modularity facilitates comparative and incremental research in agent memory while lowering implementation cost and risk of error.
7. Source Availability and Community Usage
MemEngine is distributed open source and publicly accessible at https://github.com/nuster1128/MemEngine. Comprehensive documentation, code examples, and demonstration scripts are provided, supporting both basic and advanced agent architectures.
MemEngine’s standardized interface and extensibility make it a foundational resource for evaluating, composing, and contrasting agent memory models under identical operational conditions (Zhang et al., 4 May 2025).