KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents (2403.03101v1)

Published 5 Mar 2024 in cs.CL, cs.AI, cs.HC, cs.LG, and cs.MA

Abstract: LLMs have demonstrated great potential in complex reasoning tasks, yet they fall short when tackling more sophisticated challenges, especially when interacting with environments through generating executable actions. This inadequacy primarily stems from the lack of built-in action knowledge in language agents, which fails to effectively guide the planning trajectories during task solving and results in planning hallucination. To address this issue, we introduce KnowAgent, a novel approach designed to enhance the planning capabilities of LLMs by incorporating explicit action knowledge. Specifically, KnowAgent employs an action knowledge base and a knowledgeable self-learning strategy to constrain the action path during planning, enabling more reasonable trajectory synthesis, and thereby enhancing the planning performance of language agents. Experimental results on HotpotQA and ALFWorld based on various backbone models demonstrate that KnowAgent can achieve comparable or superior performance to existing baselines. Further analysis indicates the effectiveness of KnowAgent in terms of planning hallucinations mitigation. Code is available in https://github.com/zjunlp/KnowAgent.

The paper "KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents" addresses a notable limitation in LLMs when applied to complex reasoning tasks, particularly when these models are required to generate executable actions in dynamic environments. The core issue identified is the phenomenon of "planning hallucination," where the models output illogical or ungrounded sequences of actions, reflecting a fundamental inadequacy in built-in action knowledge. In response, the authors propose KnowAgent, a novel approach that enhances LLMs' planning capabilities by incorporating explicit action knowledge.

Introduction

State-of-the-art LLMs show substantial potential in various reasoning tasks; however, their capabilities are limited by the absence of built-in action knowledge. This absence degrades performance, which shows up as erratic planning trajectories and frequent planning hallucinations. KnowAgent addresses these shortcomings by combining an action knowledge base with a knowledgeable self-learning mechanism. Together, these constrain the planning path so that the agent produces more coherent and logically sound action sequences.

Methodology

The paper's overview figure summarizes the architecture of KnowAgent. The method involves four components:

Definition of Action Knowledge: This entails specifying actions pertinent to particular tasks and the logical rules governing their transitions. Action knowledge is represented as $(E_a, \mathcal{R})$, comprising a set of discrete actions $E_a$ and action rules $\mathcal{R}$ that dictate valid action transitions.
Action Knowledge to Text: This process converts the structured action knowledge into textual descriptions, allowing the LLM to interpret and utilize this knowledge effectively in planning.
Planning Path Generation: This utilizes prompts informed by action knowledge to guide LLMs in generating comprehensive planning paths, ensuring that the paths align with predefined logical rules.
Knowledgeable Self-Learning: Over multiple iterations, the model refines its understanding of action knowledge through a self-learning mechanism that synthesizes, filters, and fine-tunes on trajectories to enhance planning accuracy continuously.
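
The components above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the action names, the transition rules, and every function name (`knowledge_to_text`, `is_valid_path`, `self_learning_round`) are hypothetical, and the `synthesize` and `fine_tune` callables stand in for the LLM sampling and training steps that the summary does not specify.

```python
from typing import Callable, Dict, List, Set

# Hypothetical action knowledge (E_a, R): each key is an action in E_a and
# its value is the set of actions allowed to follow it (the rules R).
ACTION_RULES: Dict[str, Set[str]] = {
    "Start": {"Search", "Retrieve"},
    "Search": {"Search", "Retrieve", "Lookup", "Finish"},
    "Retrieve": {"Search", "Retrieve", "Lookup", "Finish"},
    "Lookup": {"Search", "Retrieve", "Lookup", "Finish"},
    "Finish": set(),
}


def knowledge_to_text(rules: Dict[str, Set[str]]) -> str:
    """Render the transition rules as plain text for inclusion in a prompt."""
    lines = []
    for action, successors in rules.items():
        if successors:
            allowed = ", ".join(sorted(successors))
            lines.append(f"{action} may be followed by: {allowed}.")
        else:
            lines.append(f"{action} ends the plan.")
    return "\n".join(lines)


def is_valid_path(path: List[str], rules: Dict[str, Set[str]]) -> bool:
    """True if the path runs from Start to Finish and obeys every rule."""
    if not path or path[0] != "Start" or path[-1] != "Finish":
        return False
    return all(nxt in rules.get(cur, set()) for cur, nxt in zip(path, path[1:]))


def self_learning_round(
    synthesize: Callable[[object], List[str]],
    fine_tune: Callable[[List[List[str]]], object],
    tasks: List[object],
    rules: Dict[str, Set[str]],
) -> object:
    """One round: synthesize a path per task, keep rule-abiding ones, fine-tune."""
    kept = [p for p in (synthesize(t) for t in tasks) if is_valid_path(p, rules)]
    return fine_tune(kept)
```

Representing the rules as a successor map keeps the text rendering and the validity check driven by the same data, so a path that passes `is_valid_path` is by construction consistent with what the prompt told the model.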

Experimental Results

The effectiveness of KnowAgent is evaluated on two benchmarks: HotpotQA and ALFWorld. The experiments use Llama-2 backbone models at three scales (7B, 13B, and 70B parameters). Compared with baseline methods such as CoT, ReAct, Reflexion, and FireAct, KnowAgent achieves comparable or superior performance.

Key Findings

Performance Metrics: KnowAgent improves F1 scores on HotpotQA and success rates on ALFWorld. Specifically, Llama-2-13B with KnowAgent shows gains of 15.09% and 37.81% over ReAct on HotpotQA and ALFWorld, respectively.
Reduction of Invalid Actions: The incorporation of action knowledge leads to a marked reduction in planning hallucinations. The error rates for invalid and misordered actions are significantly lower in KnowAgent compared to ReAct and Reflexion.
Iteration Benefits: Iterative self-learning further improves the model's performance, indicating that action knowledge and self-improvement cycles reinforce each other. More iterations correlate with better adaptation to and integration of action knowledge.
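
One way to measure the invalid and misordered action rates mentioned above is sketched below. The definitions used here are assumptions made for illustration, not the paper's exact metric: an action counts as invalid when it is not in the action set at all, and as misordered when it is a known action that the rules do not allow after the previous one. The rule table and the function name `count_action_errors` are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical rule table: action -> set of actions allowed to follow it.
RULES: Dict[str, Set[str]] = {
    "Start": {"Search", "Retrieve"},
    "Search": {"Search", "Retrieve", "Lookup", "Finish"},
    "Retrieve": {"Search", "Retrieve", "Lookup", "Finish"},
    "Lookup": {"Search", "Retrieve", "Lookup", "Finish"},
    "Finish": set(),
}


def count_action_errors(path: List[str], rules: Dict[str, Set[str]]) -> Dict[str, int]:
    """Count invalid and misordered actions in a path that follows Start.

    `path` lists the actions the agent emitted after the implicit Start step.
    """
    counts = {"invalid": 0, "misordered": 0, "total": 0}
    previous = "Start"
    for action in path:
        counts["total"] += 1
        if action not in rules:
            # Unknown action: count it and keep the last known action as context.
            counts["invalid"] += 1
            continue
        if action not in rules[previous]:
            counts["misordered"] += 1
        previous = action
    return counts


def error_rates(paths: List[List[str]], rules: Dict[str, Set[str]]) -> Dict[str, float]:
    """Aggregate per-action error rates over a collection of paths."""
    totals = {"invalid": 0, "misordered": 0, "total": 0}
    for path in paths:
        for key, value in count_action_errors(path, rules).items():
            totals[key] += value
    denominator = max(totals["total"], 1)  # avoid division by zero on empty input
    return {
        "invalid": totals["invalid"] / denominator,
        "misordered": totals["misordered"] / denominator,
    }
```

Skipping unknown actions when updating `previous` is a design choice: it avoids charging the following step with a misordering error that was really caused by the hallucinated action before it.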

Implications and Future Developments

The research has significant implications for the enhancement of LLM-based agents:

Theoretical:
- Demonstrates the feasibility and benefits of integrating structured knowledge into LLMs for better planning performance.
- Illustrates the effectiveness of knowledgeable self-learning in mitigating planning hallucinations.
Practical:
- Facilitates the construction of more reliable and efficient LLM-based agents for complex environments, potentially accelerating advancements in fields such as automated customer service, intelligent personal assistants, and autonomous systems.

Future developments could explore:

Task Diversity: Extending the framework's applicability to a broader range of tasks, including medical diagnosis, web browsing, and embodied robotics.
Automation in Knowledge Creation: Enhancing automation in the creation of action knowledge bases to reduce manual effort and improve scalability.
Multi-Agent Systems: Applying the framework within multi-agent contexts to observe collaborative planning enhancements.

Conclusion

KnowAgent advances LLM-based agents by addressing planning hallucinations through structured action knowledge combined with a knowledgeable self-learning mechanism. The empirical results on HotpotQA and ALFWorld support the approach, marking a step towards more reliable AI agents capable of navigating and interacting within complex environments.