Memp: Exploring Agent Procedural Memory (2508.06433v2)

Published 8 Aug 2025 in cs.CL, cs.AI, cs.LG, and cs.MA

Abstract: LLM-based agents excel at diverse tasks, yet they suffer from brittle procedural memory that is manually engineered or entangled in static parameters. In this work, we investigate strategies to endow agents with a learnable, updatable, and lifelong procedural memory. We propose Memp that distills past agent trajectories into both fine-grained, step-by-step instructions and higher-level, script-like abstractions, and explore the impact of different strategies for Build, Retrieval, and Update of procedural memory. Coupled with a dynamic regimen that continuously updates, corrects, and deprecates its contents, this repository evolves in lockstep with new experience. Empirical evaluation on TravelPlanner and ALFWorld shows that as the memory repository is refined, agents achieve steadily higher success rates and greater efficiency on analogous tasks. Moreover, procedural memory built from a stronger model retains its value: migrating the procedural memory to a weaker model yields substantial performance gains.

Summary

The paper introduces $Mem^p$, a procedural memory framework that encodes agent trajectories into reusable templates.
It employs build, retrieval, and update mechanisms, using query-vector matching and reflection to improve long-horizon task performance.
Experimental results show higher task success with fewer steps, and show that procedural memory built by a stronger model transfers to a weaker one.

Procedural Memory in AI Agents: Insights and Framework

Procedural memory plays a vital role in the cognitive capabilities of AI agents, much as it does in human cognition. This paper investigates procedural memory in LLM-based agents, examining strategies for building, retrieving, and continuously updating it as the agent gains new experience.

Framework for Procedural Memory

Procedural Memory Model

The framework, named $Mem^p$, treats procedural memory as a high-level optimization object. It distills past agent trajectories into both fine-grained, step-by-step instructions and higher-level, script-like abstractions, providing a structured repository that evolves with new experience.

Figure 1: The procedural memory framework consists of Build, Retrieve, and Update, which respectively involve encoding stored procedural memory, forming new procedural memories, and modifying existing ones in light of new experiences.

Build Phase

The build phase consists of capturing the trajectories from previous agent interactions. $Mem^p$ distills these historical trajectories into procedural templates, reusable knowledge that guides future interactions.
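The build step can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Trajectory` and `MemoryEntry` classes, the `summarize` helper (a stand-in for an LLM abstraction call), and the choice to keep only successful trajectories are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    """One past episode: the task text, the steps taken, and whether it succeeded."""
    task: str
    steps: list[str]
    success: bool


@dataclass
class MemoryEntry:
    """A stored procedure: the originating task plus two views of the procedure."""
    task: str
    steps: list[str]  # fine-grained, step-by-step view
    script: str       # higher-level, script-like view


def summarize(steps: list[str]) -> str:
    # Hypothetical stand-in for an LLM call that abstracts steps into a script.
    # Here we keep only the first word of each step and drop consecutive repeats.
    verbs = [s.split()[0] for s in steps if s.strip()]
    deduped = [v for i, v in enumerate(verbs) if i == 0 or v != verbs[i - 1]]
    return " -> ".join(deduped)


def build_memory(trajectories: list[Trajectory]) -> list[MemoryEntry]:
    # Assumption: only successful trajectories are distilled at build time.
    return [
        MemoryEntry(task=t.task, steps=list(t.steps), script=summarize(t.steps))
        for t in trajectories
        if t.success
    ]
```

Each entry keeps both the raw steps and the abstracted script, so a later retrieval step could return whichever granularity suits the new task.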

Retrieval Phase

During retrieval, $Mem^p$ employs various strategies such as query-vector matching and keyword-vector matching to select the most relevant procedural memory. This retrieval process ensures that valuable past insights can be effectively utilized to tackle similar upcoming tasks.
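Query-vector matching can be sketched as a nearest-neighbour lookup by cosine similarity. The example below is a toy under stated assumptions: `embed` is a bag-of-words stand-in for a real embedding model, and `retrieve` and the dictionary layout (task description mapped to stored procedure) are hypothetical names, not the paper's API.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy embedding (assumption): lowercase bag-of-words counts.
    # A real system would call a sentence-embedding model instead.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, memory: dict[str, str], k: int = 1) -> list[str]:
    """Return the keys of the k stored tasks most similar to the query."""
    q = embed(query)
    ranked = sorted(memory, key=lambda task: cosine(q, embed(task)), reverse=True)
    return ranked[:k]
```

Under this sketch, a keyword-based variant would differ only in what text is passed to `embed`: extracted keywords rather than the full task description.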

Memory Update Mechanisms

Memory updating is implemented through strategies such as validation filtering and reflection-based mechanisms. These let the procedural memory repository adjust dynamically: obsolete or erroneous entries are discarded or corrected, and valuable new information is added.
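The two update strategies named above can be sketched in a few lines of Python. This is an illustrative reading, not the paper's code: the `update_memory` function, its parameters, and the `reflect` callback (a stand-in for an LLM reflection step) are hypothetical, and dropping a failed entry when no reflection is available is an assumption.

```python
from typing import Callable, Optional


def update_memory(
    memory: dict[str, str],
    task: str,
    procedure: str,
    success: bool,
    retrieved_key: Optional[str] = None,
    reflect: Optional[Callable[[str, str], str]] = None,
) -> dict[str, str]:
    """Return an updated copy of the memory after one new episode."""
    updated = dict(memory)  # leave the caller's repository untouched
    if success:
        # Validation filtering: only procedures from successful episodes are stored.
        updated[task] = procedure
    elif retrieved_key in updated:
        if reflect is not None:
            # Reflection-based update: revise the entry that led to the failure.
            updated[retrieved_key] = reflect(updated[retrieved_key], procedure)
        else:
            # Assumption: without a reflection step, the misleading entry is deprecated.
            del updated[retrieved_key]
    return updated
```

Returning a copy rather than mutating in place makes it easy to compare the repository before and after each episode.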

Figure 2: Reward gain and steps reduction vs. trajectory group index with procedural memory.

Experimental Analysis

Datasets and Models

The framework was evaluated on two benchmarks, TravelPlanner and ALFWorld, which test agents' ability to handle long-horizon tasks requiring procedural knowledge.

Results

Results showed marked improvements in both task success rate and execution efficiency (fewer steps) with procedural memory in place. The framework was instantiated with GPT-4o, Claude-3.5-sonnet, and Qwen2.5-72B-Instruct, and outperformed approaches lacking memory integration.

Figure 3: With procedural memory, agents can improve both the success rate (accuracy ↑) and execution efficiency (steps ↓) when solving similar tasks.

Insights and Implications

Procedural Memory Transferability

The paper showcased the transferability of procedural memory from stronger models to weaker ones, illustrating how knowledge from a robust system can enhance the operational efficiency of less capable models.

Figure 4: (a) Transfer of GPT-4o's procedural memory to Qwen2.5-14B-Instruct and the resulting performance on the TravelPlanner dataset. (b) Relationship between the quantity of procedural memory retrieved and GPT-4o's performance on the ALFWorld dataset.

Scaling Considerations

Scalability was observed as an advantage of vector-based retrieval, allowing agents to sift through extensive repositories of procedural knowledge to optimize performance on novel tasks.

Figure 5: Comparison of trajectories with and without procedural memory; using procedural memory shortens the process by 9 steps and saves 685 tokens.

Conclusion

Procedural memory significantly enhances the cognitive capabilities and adaptability of AI agents, marking an essential step toward self-improving systems. Future work can explore even more sophisticated retrieval strategies and judge-based task completion assessments, enabling agents to further refine their expertise dynamically. By fostering memory transferability and efficient update mechanisms, $Mem^p$ promises noteworthy advancements in AI agent capabilities.