HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model

Published 18 Aug 2024 in cs.CL, cs.AI, and cs.RO | (2408.09559v1)

Abstract: LLM-based agents exhibit significant potential across various domains, operating as interactive systems that process environmental observations to generate executable actions for target tasks. The effectiveness of these agents is significantly influenced by their memory mechanism, which records historical experiences as sequences of action-observation pairs. We categorize memory into two types: cross-trial memory, accumulated across multiple attempts, and in-trial memory (working memory), accumulated within a single attempt. While considerable research has optimized performance through cross-trial memory, the enhancement of agent performance through improved working memory utilization remains underexplored. Instead, existing approaches often involve directly inputting entire historical action-observation pairs into LLMs, leading to redundancy in long-horizon tasks. Inspired by human problem-solving strategies, this paper introduces HiAgent, a framework that leverages subgoals as memory chunks to manage the working memory of LLM-based agents hierarchically. Specifically, HiAgent prompts LLMs to formulate subgoals before generating executable actions and enables LLMs to decide proactively to replace previous subgoals with summarized observations, retaining only the action-observation pairs relevant to the current subgoal. Experimental results across five long-horizon tasks demonstrate that HiAgent achieves a twofold increase in success rate and reduces the average number of steps required by 3.8. Additionally, our analysis shows that HiAgent consistently improves performance across various steps, highlighting its robustness and generalizability. Project Page: https://github.com/HiAgent2024/HiAgent .

Summary

The paper introduces HiAgent, a hierarchical framework that uses subgoals to segment and summarize working memory for long-horizon tasks.
It achieves a twofold increase in success rate while reducing the average number of task steps by 3.8 and cutting context length by 35.02%.
The method integrates trajectory retrieval and proactive memory replacement, highlighting its potential for robotics and interactive systems.

Detailed Summary of "HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with LLM"

Introduction

The paper "HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with LLM" (2408.09559) presents HiAgent, a framework designed to improve the performance of LLM-based agents on long-horizon tasks. These agents generate actions from environmental observations, and their efficacy relies heavily on memory mechanisms that store past action-observation sequences. The paper identifies a gap in current research: in-trial memory (working memory) has received far less attention than cross-trial memory. HiAgent addresses this gap by managing working memory hierarchically, using subgoals as memory chunks.

Hierarchical Working Memory Management

HiAgent's methodology diverges from the standard paradigm by adopting a human-inspired problem-solving strategy: using subgoals to manage cognitive load. Existing systems typically feed the entire sequence of past actions and observations into the LLM, which leads to inefficiency and redundancy, especially in long-horizon tasks. HiAgent mitigates this by prompting the LLM to formulate subgoals, which serve as memory chunks. Once a subgoal is achieved, the agent summarizes its chunk, retaining in detail only the action-observation pairs relevant to the current subgoal.

Figure 1: Working memory management in LLM-based agents: the standard paradigm (top) and the HiAgent strategy (bottom).

Alongside this reduction in working memory, the paper reports a twofold increase in success rate across five long-horizon tasks, which it presents as evidence of HiAgent's robustness and efficiency.
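The difference between the two strategies can be illustrated with a minimal sketch. The `Chunk` class, both function names, the prompt layout, and the toy trajectory are assumptions made for illustration; they are not taken from the paper or its code release.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Chunk:
    """A subgoal and the (action, observation) pairs gathered while pursuing it."""
    subgoal: str
    pairs: List[Tuple[str, str]] = field(default_factory=list)
    summary: Optional[str] = None  # filled in once the subgoal is finished


def flat_context(chunks: List[Chunk]) -> str:
    """Standard strategy: every pair from every chunk stays in the prompt."""
    lines = []
    for chunk in chunks:
        for action, observation in chunk.pairs:
            lines += [f"Action: {action}", f"Observation: {observation}"]
    return "\n".join(lines)


def hierarchical_context(chunks: List[Chunk]) -> str:
    """Subgoal-based strategy: a summarized chunk shows only its summary."""
    lines = []
    for chunk in chunks:
        lines.append(f"Subgoal: {chunk.subgoal}")
        if chunk.summary is not None:
            lines.append(f"Summary: {chunk.summary}")
        else:
            for action, observation in chunk.pairs:
                lines += [f"Action: {action}", f"Observation: {observation}"]
    return "\n".join(lines)


# Toy trajectory, invented for illustration only.
chunks = [
    Chunk(
        subgoal="find the mug",
        pairs=[
            ("go to kitchen", "You are in the kitchen."),
            ("open cabinet", "The cabinet contains a mug."),
            ("take mug", "You are holding the mug."),
        ],
        summary="The mug was found in the kitchen cabinet and picked up.",
    ),
    Chunk(
        subgoal="fill the mug with water",
        pairs=[("go to sink", "You are at the sink.")],
    ),
]

flat = flat_context(chunks)
compact = hierarchical_context(chunks)
print(len(flat.splitlines()), "lines in the flat context")
print(len(compact.splitlines()), "lines in the hierarchical context")
```

In this toy case the finished subgoal collapses from three action-observation pairs to a single summary line, while the active subgoal keeps its pairs in full. The savings grow with the number of steps spent on each finished subgoal.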

Methodological Details

HiAgent prompts the LLM to formulate a subgoal before generating executable actions, and uses summarization to trim the working memory. The process is as follows:

Subgoal Generation: LLMs are prompted to establish subgoals, each considered a memory chunk.
Action Execution and Summarization: Upon achieving a subgoal, the associated action-observation pairs are summarized, compacting the working memory efficiently.
Proactive Memory Management: The LLM itself decides when to replace a previous subgoal's action-observation pairs with a summarized observation, so that only the pairs relevant to the current subgoal remain in detail.

Figure 2: An overview of the HiAgent process illustrating subgoal-based hierarchical memory management.
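The three steps above can be sketched as a small stateful memory manager. This is an illustrative sketch under stated assumptions, not the authors' implementation: the class name `HierarchicalMemory`, its methods, and the rule that a change of subgoal triggers summarization are hypothetical, and `last_observation_summary` is a stand-in for what would be an LLM summarization call.

```python
from typing import Callable, List, Optional, Tuple

Pair = Tuple[str, str]  # (action, observation)


def last_observation_summary(subgoal: str, pairs: List[Pair]) -> str:
    """Stand-in summarizer: a real agent would ask an LLM to summarize here."""
    return pairs[-1][1] if pairs else "(no observations)"


class HierarchicalMemory:
    """Keeps full detail for the current subgoal and summaries for earlier ones."""

    def __init__(self, summarize: Callable[[str, List[Pair]], str]):
        self.summarize = summarize
        self.finished: List[Tuple[str, str]] = []  # (subgoal, summary)
        self.current_subgoal: Optional[str] = None
        self.current_pairs: List[Pair] = []

    def step(self, subgoal: str, action: str, observation: str) -> None:
        """Record one step; a new subgoal triggers summarization of the old one."""
        if subgoal != self.current_subgoal:
            if self.current_subgoal is not None:
                summary = self.summarize(self.current_subgoal, self.current_pairs)
                self.finished.append((self.current_subgoal, summary))
            self.current_subgoal = subgoal
            self.current_pairs = []
        self.current_pairs.append((action, observation))

    def render(self) -> str:
        """Build the working-memory text that would be placed in the prompt."""
        lines = []
        for subgoal, summary in self.finished:
            lines += [f"Subgoal: {subgoal}", f"Summary: {summary}"]
        if self.current_subgoal is not None:
            lines.append(f"Subgoal: {self.current_subgoal}")
            for action, observation in self.current_pairs:
                lines += [f"Action: {action}", f"Observation: {observation}"]
        return "\n".join(lines)


# Toy run with invented steps; in a real agent the LLM proposes subgoal and action.
memory = HierarchicalMemory(summarize=last_observation_summary)
memory.step("find the mug", "go to kitchen", "You are in the kitchen.")
memory.step("find the mug", "take mug", "You are holding the mug.")
memory.step("fill the mug", "go to sink", "You are at the sink.")
print(memory.render())
```

Passing the summarizer in as a callable keeps the sketch runnable without a model; swapping in an LLM-backed function would not change the rest of the class.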

Further, HiAgent incorporates a trajectory retrieval module that retrieves the detailed trajectory of a past subgoal on demand, so the agent can recover information that a summary omits.
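One way to picture on-demand retrieval is an archive that keeps the full action-observation pairs of each finished subgoal outside the prompt and expands them only when requested. The sketch below is a hypothetical illustration: `TrajectoryArchive`, `render_chunk`, and the `expand` flag are assumed names, and the summary does not specify how the paper's module decides when to retrieve.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (action, observation)


class TrajectoryArchive:
    """Stores the detailed trajectory of each finished subgoal outside the prompt."""

    def __init__(self) -> None:
        self._by_subgoal: Dict[str, List[Pair]] = {}

    def archive(self, subgoal: str, pairs: List[Pair]) -> None:
        """Save a copy of the pairs before they are replaced by a summary."""
        self._by_subgoal[subgoal] = list(pairs)

    def retrieve(self, subgoal: str) -> List[Pair]:
        """Return the stored pairs, or an empty list if nothing was archived."""
        return list(self._by_subgoal.get(subgoal, []))


def render_chunk(subgoal: str, summary: str, archive: TrajectoryArchive,
                 expand: bool = False) -> str:
    """Show the summary by default; show the full trajectory when expand is set."""
    lines = [f"Subgoal: {subgoal}"]
    pairs = archive.retrieve(subgoal) if expand else []
    if pairs:
        for action, observation in pairs:
            lines += [f"Action: {action}", f"Observation: {observation}"]
    else:
        lines.append(f"Summary: {summary}")
    return "\n".join(lines)


# Toy data, invented for illustration only.
store = TrajectoryArchive()
store.archive("find the mug", [
    ("go to kitchen", "You are in the kitchen."),
    ("take mug", "You are holding the mug."),
])

short_view = render_chunk("find the mug", "The mug was picked up.", store)
full_view = render_chunk("find the mug", "The mug was picked up.", store, expand=True)
print(short_view)
print(full_view)
```

Falling back to the summary when nothing is archived keeps `render_chunk` safe to call for any subgoal, at the cost of silently ignoring an `expand` request that cannot be satisfied.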

Experimental Framework

The effectiveness of HiAgent is validated across five long-horizon tasks, where it shows significant improvements in success rate, progress rate, and execution efficiency. The experiments show that HiAgent reduces the average number of steps needed to complete a task by 3.8, context length by 35.02%, and run time by 19.42%.

Impact and Future Directions

HiAgent's results on long-horizon tasks suggest that it could apply to domains that require complex decision-making and memory management, especially robotics and interactive systems. Its hierarchical memory framework could inspire further work on autonomous systems that emulate human-like problem-solving through efficient memory management.

Conclusion

HiAgent presents a significant advance in memory management for LLM-based agents by hierarchically structuring working memory around subgoals, thus improving both efficiency and effectiveness in executing long-horizon tasks. The paper contributes a novel perspective to the field of AI, offering a promising approach to enhancing the cognitive capabilities of autonomous agents.