Self-Adaptive Long-term Memory (SALM)

Updated 26 February 2026
  • Self-Adaptive Long-term Memory (SALM) is a framework that endows AI systems with continuously evolving, structured memory for lifelong adaptation and robust long-term reasoning.
  • It integrates modules for storage, retrieval, updating, and consolidation, enabling dynamic management of experiences and efficient memory pruning.
  • Empirical implementations like OMNE/GAIA, MAPLE, and FALCON demonstrate SALM’s effectiveness in boosting performance on diverse tasks such as logical deduction, question answering, and code generation.

Self-Adaptive Long-term Memory (SALM) refers to a class of AI architectures and algorithms that equip foundation models or intelligent agents with a continuously evolving, structured memory system. Unlike fixed parametric memory (weights) or limited working context, SALM provides mechanisms for dynamically storing, retrieving, updating, consolidating, and pruning experiences encountered across long-term interactions. The core objective is to enable self-evolution during deployment, supporting lifelong, personalized adaptation, stable knowledge retention, and robust reasoning over extended temporal horizons (Jiang et al., 2024, He et al., 2024).

1. Architectural Principles and Formalism

The canonical SALM architecture augments a (typically frozen) foundation model with four interacting modules:

  1. Memory Storage Unit: Persists structured experiences as entries, often key–value–weight tuples $(k, v, w)$, where $k \in \mathbb{R}^d$ is a context embedding, $v$ encodes the stored content (raw fragment, summary node, or expert trace), and $w \geq 0$ tracks usage or importance.
  2. Memory Retrieval Unit: Given a query $q_t$, computes a relevance score over stored entries (e.g., via scaled dot-product attention or hybrid keyword-dense retrieval), returning the top-$K$ most relevant items for inference-time augmentation.
  3. Memory Update Unit: On observing a new input $x_t$, the unit updates memory $M_{t-1} \to M_t$ by selective addition, merging, or adaptive pruning based on thresholds ($\tau_\text{add}$, $\tau_\text{merge}$, $\epsilon$). Decay rates $\rho < 1$ modulate weight forgetting.
  4. Memory Consolidation Unit: Periodically reorganizes memory content using summarization, clustering, or graph sparsification, adjusting granularity between fine-grained episodic traces and abstract semantic/procedural summaries.
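
As a worked illustration of how the decay rate $\rho$ and the pruning threshold $\epsilon$ interact, assume purely multiplicative decay: an entry that is never retrieved again has its weight multiplied by $\rho$ at every step. The number of steps $n$ until an entry with current weight $w_t > \epsilon$ is pruned then follows directly. The numeric values in the comments are illustrative assumptions, not figures from the cited work.

```latex
w_{t+n} = \rho^{n} w_t < \epsilon
\quad\Longleftrightarrow\quad
n > \frac{\ln(w_t / \epsilon)}{\ln(1/\rho)}
% Illustrative values (assumed): \rho = 0.9,\ w_t = 1,\ \epsilon = 0.1
% n > \ln(10) / \ln(1/0.9) \approx 2.3026 / 0.1054 \approx 21.85,
% so the entry falls below \epsilon after 22 unreinforced steps.
```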

Each module's behavior is governed by a self-adaptation policy informed by real-time usage metrics (e.g., retrieval frequencies, novelty scores), allowing dynamic tuning of thresholds, decay, and consolidation rates (Jiang et al., 2024). Memory is often organized into columns specializing in specific semantic or episodic streams, with per-column adaptation of granularity and retention policy.

Formal update: For memory state $M_t = \{m_t^i = (k_t^i, v_t^i, w_t^i)\}_{i=1}^{N_t}$, the update function when $x_t$ arrives (with embedding $e_t$) is:

  1. Retrieve similar entries $S = \{i : \operatorname{sim}(k_{t-1}^i, e_t) > \tau_\text{add}\}$
  2. Merge or append:
    • If $\exists i \in S$ with $\operatorname{sim}(k_{t-1}^i, e_t) > \tau_\text{merge}$, update $m_t^i \gets \operatorname{merge}(m_{t-1}^i, x_t)$
    • Else append $m_t^\text{new} = (e_t, x_t, w_\text{init})$
  3. Update/decay weights: $\forall i,\, w_t^i = \rho w_{t-1}^i + \alpha\, \delta[i \in S]$
  4. Prune: Remove entries with $w_t^i < \epsilon$
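
The four steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the reference implementation: cosine similarity stands in for $\operatorname{sim}$, "merge" is taken to mean appending the new fragment to the matched entry, a freshly appended entry keeps $w_\text{init}$ for the current step, and all default hyperparameter values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Entry:
    key: list      # context embedding k
    value: list    # stored content v (here: a list of raw fragments)
    weight: float  # usage/importance weight w


def cosine(a, b):
    """Cosine similarity; stands in for sim(., .) (an assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def update_memory(memory, e_t, x_t, tau_add=0.5, tau_merge=0.9,
                  rho=0.95, alpha=1.0, eps=0.05, w_init=1.0):
    """One update M_{t-1} -> M_t for input x_t with embedding e_t."""
    # 1. Retrieve similar entries S.
    sims = [cosine(m.key, e_t) for m in memory]
    S = {i for i, s in enumerate(sims) if s > tau_add}

    # 2. Merge into the closest entry if it clears tau_merge, else append.
    best = max(S, key=lambda i: sims[i], default=None)
    new_entry = None
    if best is not None and sims[best] > tau_merge:
        memory[best].value.append(x_t)  # hypothetical merge rule
    else:
        new_entry = Entry(list(e_t), [x_t], w_init)

    # 3. Decay every existing weight; reinforce those in S by alpha.
    for i, m in enumerate(memory):
        m.weight = rho * m.weight + (alpha if i in S else 0.0)

    # 4. Prune entries whose weight fell below eps.
    kept = [m for m in memory if m.weight >= eps]
    if new_entry is not None:
        kept.append(new_entry)
    return kept
```

Calling `update_memory` repeatedly with the same embedding keeps reinforcing a single entry, while unrelated entries shrink by a factor of `rho` per step until they are pruned.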

Retrieval scores for a query $q_t$ typically use $r_t^i = \exp(q_t \cdot k_{t-1}^i/\sqrt{d}) / \sum_j \exp(q_t \cdot k_{t-1}^j/\sqrt{d})$ (Jiang et al., 2024).
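
The scoring rule is a softmax over scaled dot products, which a short sketch makes concrete. The function names `retrieval_scores` and `top_k` are hypothetical, and the code assumes a non-empty list of keys that share the query's dimension.

```python
import math


def retrieval_scores(query, keys):
    """Softmax over q . k / sqrt(d) for each stored key (keys must be non-empty)."""
    d = len(query)
    logits = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(query, keys, k):
    """Return (index, score) pairs for the k highest-scoring keys."""
    scores = retrieval_scores(query, keys)
    ranked = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:k]]
```

Subtracting the maximum logit leaves the softmax unchanged but avoids overflow when dot products are large.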

2. Cognitive and Theoretical Underpinnings

SALM generalizes classical cognitive architectures (ACT-R, Soar, Sigma) and extends the Standard Model of the Mind by unifying all six long-term memory types—parametric/non-parametric × episodic/semantic/procedural. Crucially, SALM introduces systematic "adapters" (policies) enabling online adaptation of storage, retrieval, and forgetting:

  • Human-to-AI mappings:
    • Episodic memory—event buffers and timestamped recurrent nets
    • Semantic memory—external knowledge bases/vectors (non-parametric) and classifiers/segmenters (parametric)
    • Procedural memory—RL policies and production-rule/code update engines

Adapters for storage ($\pi_s$), retrieval ($\pi_r$), and forgetting ($\pi_f$) receive supervision from downstream performance and are typically updated by policy gradient methods. The overall controller determines whether to update the parametric memory $\theta$, write to non-parametric store $\mathcal{M}$, or skip storage, and analogously for retrieval and forgetting (He et al., 2024).

Module flow:

  1. Encode input: $m_t \leftarrow \operatorname{Enc}(x_t, s_t)$
  2. Storage adapter samples action $a_t \in \{\text{store}_\text{param}, \text{store}_\text{nonparam}, \text{skip}\}$
  3. If parametric, update $\theta \leftarrow \theta - \eta \nabla_\theta \mathcal{L}(\theta; m_t)$; if non-parametric, $\mathcal{M} \gets \mathcal{M} \cup \{m_t\}$
  4. Evaluate and reinforce adapters from immediate or downstream task metrics
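
A storage adapter of this kind can be sketched as a small softmax policy trained with a REINFORCE-style update. The sketch below is a deliberate simplification and rests on assumptions: the policy is not conditioned on the encoded input, the reward is a scalar supplied by the caller, and `StorageAdapter`, `storage_step`, and `param_update` are hypothetical names rather than anything defined in the cited work.

```python
import math
import random

ACTIONS = ("store_param", "store_nonparam", "skip")


class StorageAdapter:
    """Context-free softmax policy over the three storage actions."""

    def __init__(self, lr=0.1, seed=0):
        self.prefs = [0.0] * len(ACTIONS)  # one preference per action
        self.lr = lr
        self.rng = random.Random(seed)

    def probs(self):
        shift = max(self.prefs)
        exps = [math.exp(p - shift) for p in self.prefs]
        total = sum(exps)
        return [e / total for e in exps]

    def sample(self):
        return self.rng.choices(range(len(ACTIONS)), weights=self.probs())[0]

    def reinforce(self, action, reward):
        # REINFORCE for a softmax policy: d log pi(a) / d pref_b = 1[a == b] - pi(b)
        p = self.probs()
        for b in range(len(ACTIONS)):
            grad = (1.0 if b == action else 0.0) - p[b]
            self.prefs[b] += self.lr * reward * grad


def storage_step(adapter, m_t, store, param_update):
    """Sample a storage action for the encoded input m_t and apply it."""
    a = adapter.sample()
    if ACTIONS[a] == "store_param":
        param_update(m_t)   # placeholder for the gradient step on theta
    elif ACTIONS[a] == "store_nonparam":
        store.append(m_t)   # write to the non-parametric store
    return a                # "skip" leaves both memories untouched
```

After each step, the caller would pass the task reward to `adapter.reinforce(a, reward)`, so actions followed by good downstream results become more likely.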

3. Empirical Frameworks and Algorithmic Variations

Numerous empirical SALM systems validate these principles:

  • OMNE/GAIA: A multi-agent system with agents maintaining independent long-term memories, collaborating via retrieval-augmented generation. Achieves state-of-the-art results on 400+ logical deduction tasks (GAIA: test accuracy 40.53%, validation 46.06%, Level-3 hard questions 26.53%) (Jiang et al., 2024).
  • MAPLE: A table-question-answering pipeline where agent experience is distilled into "memory notes" by an Archiver module. Retrieval and memory evolution are formalized by threshold-based clustering and graph updates, supporting multi-agent feedback loops. Empirically, adding SALM raises WikiTQ accuracy from 71.09 to 74.01 (+2.92), and the memory system performs best at moderate similarity thresholds (δ ≈ 0.7) (Bai et al., 6 Jun 2025).
  • FALCON: In code generation, a global long-term buffer indexed by FAISS stores (task, code, feedback) tuples. Meta-reinforcement learning alternates inner-loop task-local adaptation with outer-loop global consolidation, implementing a form of dual-level SALM. Experiments show state-of-the-art results on the MBPP and HumanEval benchmarks (Li et al., 2024).

Training/inference pseudocode and formal objectives appear in the primary references and reflect a consensus loop: encode, retrieve, augment prompt/context, infer, update/prune memory, consolidate periodically.

4. Variant Mechanisms and Implementation Trade-offs

SALM instantiations vary across domains and task demands:

  • MemoryBank: Employs an Ebbinghaus Forgetting Curve for decay, with memory strength $S_i$ reinforced when recalled and retention $R_i(t_i) = \exp(-t_i/S_i)$. Automated pruning occurs below a threshold. Used in long-term dialog agents demonstrating high retrieval accuracy ($\sim$0.80+) and robust contextual adaptation (Zhong et al., 2023).
  • LEMN: Retention agent (RNN policy) assigns replace/retain probabilities per memory slot, based on spatial and temporal context, optimized by RL on task rewards. Shows dominant gains in streaming QA and RL environments, excelling in noisy/long-horizon regimes (Jung et al., 2018).
  • FluxMem: Memory is organized in a three-level hierarchy (short-term, mid-term, long-term), with context-aware structure selection, and distribution-aware fusion using a Beta Mixture Model gate for dynamic session merging. Offline-trained structure selectors and unsupervised EM for mixture gating yield robust adaptation to interaction heterogeneity, with 9.18% accuracy gain over best fixed-structure baselines (Lu et al., 15 Feb 2026).
  • SALM in Online Learning/Bandits: SALM is formalized as a reduction for achieving long-term memory regret bounds of $O(\sqrt{T(S\ln T + n\ln K)})$, combining static-regret and adaptive switching-regret subroutines to efficiently "remember" and revisit optimal expert policies (Zheng et al., 2019).
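
The forgetting-curve decay described for MemoryBank can be illustrated with a few lines of Python. This is a sketch under assumptions: recall is taken to increase the strength by a fixed `boost` and reset the elapsed time, and the pruning threshold and all numeric defaults are hypothetical.

```python
import math
from dataclasses import dataclass


def retention(elapsed, strength):
    """Ebbinghaus-style retention R = exp(-t / S)."""
    return math.exp(-elapsed / strength)


@dataclass
class Item:
    text: str
    strength: float = 1.0     # memory strength S_i
    last_recall: float = 0.0  # time of the most recent recall


def recall(item, now, boost=1.0):
    """Reinforce an item on recall (assumed rule: S += boost, t resets)."""
    item.strength += boost
    item.last_recall = now


def prune(items, now, threshold=0.1):
    """Keep only items whose retention is still at or above the threshold."""
    return [it for it in items
            if retention(now - it.last_recall, it.strength) >= threshold]
```

With these defaults, an item that is never recalled drops below the threshold once about 2.3 time units have passed ($e^{-t} < 0.1$), whereas each recall both restarts the clock and slows the subsequent decay.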

Practical considerations involve sub-linear retrieval using approximate nearest neighbor indices (FAISS, ANNOY), asynchronous consolidation to minimize latency, per-session ephemeral or encrypted memory for privacy, and column-based sharding for scalability (Jiang et al., 2024).

5. Integration with Foundation Models and Retrieval-Augmented Generation (RAG)

SALM modules interface with LLMs, vision transformers, and RL agents through retrieval-augmented inference pipelines:

  • At each step, the retrieval unit indexes memory for relevant traces based on the prompt or sensory input.
  • Retrieved content is incorporated as additional context (e.g., as RAG chunks, graph nodes, or prompt prefix).
  • The update/consolidation mechanism integrates resulting output/feedback, enabling continuous model evolution without re-training of backbone weights.
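
The three steps above amount to a retrieve, augment, infer, update loop around a frozen model. The sketch below shows one pass of that loop. Everything in it is a stand-in: `embed` and `generate` are caller-supplied callables, `toy_embed` is a hypothetical encoder used only for demonstration, ranking uses a plain dot product, and the prompt layout is an assumption.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def toy_embed(text):
    # Hypothetical stand-in for a real encoder: vowel counts as a 5-d vector.
    return [float(text.lower().count(c)) for c in "aeiou"]


def answer_with_memory(query, memory, embed, generate, k=2):
    """One inference step with memory; the backbone behind `generate` stays frozen."""
    # Retrieve: rank stored traces by similarity to the query embedding.
    q = embed(query)
    ranked = sorted(memory, key=lambda m: dot(q, m["key"]), reverse=True)
    retrieved = ranked[:k]

    # Augment: place retrieved content in front of the query as a prompt prefix.
    prefix = "\n".join(f"[memory] {m['text']}" for m in retrieved)
    prompt = f"{prefix}\n[query] {query}" if prefix else f"[query] {query}"

    # Infer with the unchanged backbone.
    output = generate(prompt)

    # Update: write the new exchange back so later steps can retrieve it.
    memory.append({"key": q, "text": f"Q: {query} A: {output}"})
    return output, prompt
```

All adaptation in this sketch happens in `memory`; no weights are modified, which is the property the last bullet describes.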

Empirical studies demonstrate that column-wise retention and adaptive consolidation outpace global LRU (12% gain in conversational settings), while real-time decay prevents catastrophic forgetting (>95% retention over 10k+ interactions) (Jiang et al., 2024).

Design recommendations include monitoring access distributions at the column/session level; using LoRA or other parameter-efficient tuning for consolidating high-level memory summaries; and maintaining audit logs and differential privacy guarantees for personal data (Jiang et al., 2024).

6. Research Applications, Evaluation, and Future Directions

SALM has demonstrated utility in domains spanning language (long-horizon dialog, video understanding, table QA), code generation, continual RL, personalization, and bandit/online learning (Jiang et al., 2024, Bai et al., 6 Jun 2025, Li et al., 2024, Jung et al., 2018, Zheng et al., 2019). Key evaluation metrics include retrieval accuracy, downstream answer correctness, contextual coherence, storage/retrieval/forgetting F1, NDCG@K, and long-term retention on continual-learning benchmarks (He et al., 2024).

Ablation studies consistently indicate the critical role of adaptive consolidation, per-column structure, and memory selection policies. Theoretical studies show that SALM’s meta-algorithmic strategies for exploitation/exploration and structure selection yield provable memory-efficiency and regret bounds.

Future directions focus on:

  • End-to-end trainable SALM instantiation in large LLM architectures and multi-modal models
  • Improved reward signals and online adapter policies for supervision from long-horizon objectives
  • Comparative analyses of encoding and memory fusion strategies (contrastive, autoencoding, mixture-of-experts)
  • Advanced forgetting/summarization based on compression or deduplication
  • Extension to regulatory-compliant, privacy-preserving agent deployments

7. Summary Table: SALM Key Modules and Functions

| Module | Function | Adaptation Strategies |
|---|---|---|
| Storage | Store processed experience entries | Reinforcement via recall, usage/stat |
| Retrieval | Compute query-relevance, select top-K memories | Hybrid attention, session bias |
| Update | Add, merge, or prune entries | Decay, selectivity, novelty gating |
| Consolidation | Summarize & compress memory, adjust granularity | Column restructuring, clustering |
| Controllers/Adapters | Monitor and tune thresholds/hyperparameters | RL-based feedback, offline meta-RL |

SALM operationalizes a unifying, modular approach for evolving, scalable, and adaptive long-term memory in AI systems, bridging the gap between ephemeral context usage and rigid, static model weights. Empirical and theoretical advances establish SALM as foundational for next-generation self-evolving intelligent agents (Jiang et al., 2024, He et al., 2024, Bai et al., 6 Jun 2025, Lu et al., 15 Feb 2026).
