MemoryLLM: Self-Updatable Memory Pools

Updated 24 September 2025
  • MemoryLLM is a self-updatable memory pool architecture that enables large language models to dynamically absorb and integrate new information.
  • It employs a split design with static backbone weights and mutable memory parameters updated via an injection function to maintain long-term retention and controlled forgetting.
  • Empirical evaluations demonstrate superior model editing, long-context handling, and operational robustness even after millions of memory updates.

A self-updatable memory pool, as exemplified by MemoryLLM, is a mechanism that endows LLMs with the capacity to dynamically absorb and retain new knowledge throughout deployment. Rather than remaining static after training, such models can self-modify a substantial subset of their parameters—the memory pool—thereby facilitating knowledge injection, long-term retention, operational robustness under continual updates, and empirical controllability. This paradigm directly addresses the limitations of fixed parametric memory and opens new avenues for long-context reasoning, model editing, and real-world knowledge assimilation.

1. MemoryLLM Architecture and Latent Space Memory Pools

MemoryLLM augments a transformer-based LLM (notably Llama2) with a fixed-size memory pool embedded within the latent space of every transformer layer. In each layer $l$, the memory pool is represented as $N$ hidden vectors of dimension $d$, denoted $\theta_l \in \mathbb{R}^{N \times d}$. The complete model comprises static backbone weights $\varphi$ and the self-updatable memory parameters $\theta$, forming the composite model $\mathcal{M}_{(\theta, \varphi)}$.

During the forward pass, both the input sequence tokens and all memory pool tokens are processed simultaneously by the self-attention modules, allowing each token to attend to the compressed knowledge stored in the layer’s memory pool. This explicit division between persistent parameters $\varphi$ and mutable memory $\theta$ ensures that new knowledge can be integrated at multiple abstraction levels without perturbing the core backbone.
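The PyTorch-style sketch below illustrates this layout under stated assumptions: a frozen transformer block stands in for $\varphi_l$, and the memory pool $\theta_l$ is an $(N, d)$ tensor whose tokens are prepended to the input hidden states before self-attention. Class and parameter names are illustrative and are not taken from the released MemoryLLM code.

```python
import torch
import torch.nn as nn

class MemoryAugmentedLayer(nn.Module):
    """One transformer layer whose self-attention also sees a per-layer memory pool.

    `backbone_layer` stands in for a frozen transformer block (static weights phi_l);
    `memory` is the mutable pool theta_l of N hidden vectors of dimension d.
    Names and shapes are illustrative, not the released MemoryLLM implementation.
    """

    def __init__(self, backbone_layer: nn.Module, num_memory_tokens: int, d_model: int):
        super().__init__()
        self.backbone_layer = backbone_layer          # phi_l, kept frozen
        self.memory = nn.Parameter(                   # theta_l in R^{N x d}
            torch.randn(num_memory_tokens, d_model) * 0.02
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, d_model)
        batch = hidden_states.size(0)
        mem = self.memory.unsqueeze(0).expand(batch, -1, -1)
        # Input tokens attend jointly to the memory tokens and to each other.
        extended = torch.cat([mem, hidden_states], dim=1)
        out = self.backbone_layer(extended)
        # Return only the positions corresponding to the input sequence.
        return out[:, self.memory.size(0):, :]
```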

Upon encountering new textual knowledge $x$, the entire memory pool is updated via an injection function $U$:

$$\theta' = U(\theta, x)$$

For a sequence of updates $(x_1, \ldots, x_n)$,

$$\theta_n = U(\ldots U(U(\theta, x_1), x_2), \ldots, x_n)$$

This persistent yet bounded latent memory architecture creates a split-knowledge substrate: static truths in $\varphi$, updatable facts in $\theta$.
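As a minimal sketch, the sequential composition above is simply a left fold of the injection function over the incoming contexts. The helper name `inject_sequence` and its arguments are hypothetical, used only to make the recursion explicit.

```python
from functools import reduce

def inject_sequence(theta, contexts, U):
    """Apply the injection function U over a stream of contexts:
    theta_n = U(... U(U(theta, x_1), x_2) ..., x_n).
    U is whatever single-step update the model implements (see Section 2);
    this helper only expresses the sequential composition."""
    return reduce(U, contexts, theta)
```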

2. Self-Update Mechanism and Controlled Forgetting

The update mechanism for each layer is incremental, operating on a sliding window. For new context $x_c$, the process is:

  • Extract the last $K$ memory tokens $\{e_\theta^l\}$ from $\theta_l$.
  • Concatenate these $K$ tokens with the hidden states $h_l$ derived from $x_c$.
  • Pass this extended sequence through the transformer layer $\varphi_l$.
  • Overwrite the last $K$ entries in $\theta_l$ with the last $K$ output tokens, and shift the memory tokens so that older memory is “pushed out” as updates are ingested (a minimal sketch of this step follows below).
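The sketch below walks through one such per-layer update step, assuming $\theta_l$ is stored as an $(N, d)$ tensor and that the shift drops the oldest $K$ vectors to make room for the $K$ new outputs. Function and variable names are illustrative, not the released implementation.

```python
import torch

@torch.no_grad()
def update_layer_memory(theta_l: torch.Tensor,
                        h_l: torch.Tensor,
                        layer,          # frozen transformer block phi_l
                        K: int) -> torch.Tensor:
    """One illustrative self-update step for a single layer.

    theta_l: (N, d) memory pool for this layer.
    h_l:     (L, d) hidden states of the new context x_c at this layer.
    Returns the updated (N, d) memory pool.
    """
    # 1. Take the last K memory tokens.
    last_k = theta_l[-K:]                               # (K, d)
    # 2. Concatenate them with the context hidden states.
    extended = torch.cat([last_k, h_l], dim=0)          # (K + L, d)
    # 3. Pass the extended sequence through the frozen layer.
    out = layer(extended.unsqueeze(0)).squeeze(0)       # (K + L, d)
    new_tokens = out[-K:]                               # last K outputs
    # 4. Shift out the oldest K vectors and append the new tokens,
    #    so each update replaces a K/N fraction of the pool.
    return torch.cat([theta_l[K:], new_tokens], dim=0)  # (N, d)
```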

This yields exponential memory decay. After each update, a fraction $K/N$ of the pool is replaced; thus, the retention of injected knowledge after $T$ updates is

$$\text{Retention} \approx (1 - K/N)^T$$

After $N/K$ updates, retention approaches $1/e$, closely mirroring the Ebbinghaus forgetting curve and allowing both freshness and graceful forgetting. Knowledge is thus neither immediately erased nor compressed into the backbone parametrization, but managed under explicit capacity constraints.
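As a quick worked example (with illustrative pool and write sizes, not the paper's actual hyperparameters), the decay can be checked numerically:

```python
def retention(N: int, K: int, T: int) -> float:
    """Expected fraction of injected knowledge surviving T subsequent updates."""
    return (1 - K / N) ** T

# Hypothetical pool size N and per-update write size K, for illustration only.
N, K = 4096, 256
print(retention(N, K, T=N // K))   # ~0.356, close to 1/e ~ 0.368
```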

3. Empirical Evaluation and Benchmarks

MemoryLLM’s empirical properties are established via standard and custom benchmarks:

  • Model Editing: On zsRE and CounterFactual, memory editing performance is evaluated on three axes: efficacy (recall of newly injected facts), generalization (response accuracy to rephrased/related queries), and specificity (preservation of unrelated facts). MemoryLLM with editing enforcement (w/EF) surpasses fine-tuning, ROME, IKE, and others.
  • Long Context Handling: On LongBench, MemoryLLM maintains or improves F1 as sequence lengths increase, outperforming backbone models by leveraging its memory pool for persistent knowledge recall.
  • Retention Trajectory: On SQuAD and NaturalQA, after 20 sequential injections, theoretical and measured knowledge retention are compared. The model approaches the exponential decay upper bound, retaining substantial knowledge beyond natural parametric memory capacity.
  • Operational Robustness: Over 650,000–1,000,000 memory updates, no degradation in updated-fact recall or overall prediction performance is observed, confirming the stability of memory self-updates at scale.

Benchmark                     Metric     MemoryLLM (w/EF) Performance
zsRE, CounterFactual          Efficacy   Stronger than FT, ROME, IKE, etc.
LongBench (long context)      F1 Score   Maintained/improved with context length
SQuAD/NaturalQA (retention)   Recall     Approaches exponential decay upper bound

4. Applications and Deployment Considerations

MemoryLLM’s self-updatable design is applicable to a range of knowledge-intensive and long-context scenarios:

  • Dynamic Fact Integration: Enables continual knowledge injection for domains with rapid knowledge change (e.g., real-time news, scientific updates).
  • Model Editing: Supports surgical updates and corrections without full retraining, applicable to chatbots, virtual assistants, and interactive systems requiring factual grounding.
  • Long-context and Multi-turn QA: Maintains long-range coherence across large context windows, supporting summarization and agent dialog.
  • Continual Learning and Robustness: Demonstrated integrity after $10^6$ updates makes it suitable for persistent deployments in environments with streaming, evolving data.

Crucially, self-updatable memory pools can be integrated with other transformer architectures or extended to multi-modal settings. Provided model code and checkpoints (available at the cited repository) facilitate reproducibility and adaptation to different domains.

5. Open Source Release and Accessibility

The MemoryLLM codebase, including self-update algorithms, memory pool management, and training routines, is openly available. This enables:

  • Replication of empirical results and independent verification of memory update integrity.
  • Adaptation and extension to other LLMs or specialized architectures.
  • Community-driven research into dynamic memory integration and efficient long-context retention.

Such openness is intended to accelerate both benchmark-oriented development and practical deployment of self-updatable LLMs.

6. Theoretical and Practical Implications

The MemoryLLM framework embodies several critical theoretical and engineering advances:

  • Split Knowledge Representation: Decouples persistent and adaptive knowledge, enabling explicit operational control over updatable regions.
  • Efficient Exponential Forgetting: Balances freshness and retention, minimizing catastrophic forgetting while avoiding stale knowledge accumulation.
  • Operational Stability: Achieves robust continual operation at scale, which is a prerequisite for persistent, adaptive AI agents.
  • Practical Deployment: Supports robust, just-in-time knowledge updates with minimal overhead, suitable for real-world production systems.

Integration of self-updatable memory pools in LLMs signals a shift from static, monolithic modeling toward incremental, governable AI systems capable of adapting autonomously to dynamic information flows, extensive contexts, and evolving requirements.
