
Pre-Act: Multi-Step Planning and Reasoning Improves Acting in LLM Agents (2505.09970v2)

Published 15 May 2025 in cs.AI

Abstract: The ReAct (Reasoning + Action) capability in LLMs has become the foundation of modern agentic systems. Recent LLMs, such as DeepSeek-R1 and OpenAI o1/o3, exemplify this by emphasizing reasoning through the generation of ample intermediate tokens, which help build a strong premise before producing the final output tokens. In this paper, we introduce Pre-Act, a novel approach that enhances the agent's performance by creating a multi-step execution plan along with the detailed reasoning for the given user input. This plan incrementally incorporates previous steps and tool outputs, refining itself after each step execution until the final response is obtained. Our approach is applicable to both conversational and non-conversational agents. To measure the performance of task-oriented agents comprehensively, we propose a two-level evaluation framework: (1) turn level and (2) end-to-end. Our turn-level evaluation, averaged across five models, shows that our approach, Pre-Act, outperforms ReAct by 70% in Action Recall on the Almita dataset. While this approach is effective for larger models, smaller models crucial for practical applications, where latency and cost are key constraints, often struggle with complex reasoning tasks required for agentic systems. To address this limitation, we fine-tune relatively small models such as Llama 3.1 (8B & 70B) using the proposed Pre-Act approach. Our experiments show that the fine-tuned 70B model outperforms GPT-4, achieving a 69.5% improvement in action accuracy (turn-level) and a 28% improvement in goal completion rate (end-to-end) on the Almita (out-of-domain) dataset.

Multi-Step Planning and Reasoning in LLM Agents: Analyzing the Pre-Act Framework

The research paper "Pre-Act: Multi-Step Planning and Reasoning Improves Acting in LLM Agents" by Rawat et al. presents an approach to enhancing the performance of LLM agents. Focusing on the integration of extensive reasoning with multi-step planning, the paper proposes the Pre-Act framework as a significant improvement over the existing ReAct paradigm. ReAct, which interleaves a short reasoning step with each individual action, forms the backbone of modern agentic systems, yet it often falls short on tasks that require long-horizon sequential planning.

Core Contributions and Methodology

The authors introduce Pre-Act, a mechanism that improves agent efficacy through comprehensive multi-step planning. Where ReAct reasons only about the next action, Pre-Act generates a complete multi-step execution plan with detailed reasoning for the given user input, then refines that plan after each step by incorporating the executed steps and their tool outputs. This iterative process lets the agent condition every new action on everything done so far, progressively converging on the final response. The approach is general, applying to both conversational and non-conversational agents.
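The loop below is a minimal sketch of this plan-and-refine cycle, assuming hypothetical `plan_next` (an LLM wrapper that returns the first pending step of a freshly regenerated plan) and `run_tool` helpers; the authors' actual prompt templates and plan format are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class Step:
    reasoning: str                      # why this step comes next
    tool: Optional[str] = None          # None means "answer the user directly"
    args: dict = field(default_factory=dict)
    final_answer: Optional[str] = None  # set when tool is None

def pre_act_loop(
    user_input: str,
    plan_next: Callable[[str, list], Step],  # hypothetical LLM wrapper
    run_tool: Callable[[str, dict], str],    # hypothetical tool executor
    max_steps: int = 10,
) -> str:
    history: list = []  # (executed step, tool output) pairs
    for _ in range(max_steps):
        # Re-plan each iteration: the model sees the user input plus all
        # previously executed steps and tool outputs, emits a refreshed
        # multi-step plan with reasoning, and commits to one next step.
        step = plan_next(user_input, history)
        if step.tool is None:
            return step.final_answer or ""
        observation = run_tool(step.tool, step.args)
        history.append((step, observation))
    return "step budget exhausted before a final answer"
```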

To assess task-oriented agents, the paper proposes a two-level evaluation framework. At the turn level, performance is quantified by metrics such as Action Recall, on which Pre-Act outperforms ReAct by 70%, averaged across five models, on the Almita dataset. For broader evaluation, the end-to-end goal completion rate is tracked, providing a holistic measure of whether the agent actually finishes the task.
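As an illustration, a plausible implementation of turn-level Action Recall follows; the paper's exact matching rules (for instance, how tool arguments are compared) may differ.

```python
def action_recall(predicted: list[str], gold: list[str]) -> float:
    """Fraction of the turn's ground-truth actions that the agent predicted."""
    if not gold:
        return 1.0  # no expected actions this turn, so nothing was missed
    hits = sum(1 for action in gold if action in predicted)
    return hits / len(gold)

# One of the two expected actions was produced, so recall is 0.5.
print(action_recall(["lookup_order"], ["lookup_order", "send_email"]))
```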

A notable facet of the Pre-Act approach is its accommodation of smaller LLMs through fine-tuning. The authors fine-tune Llama 3.1 at the 8B and 70B parameter scales using Pre-Act, enabling these models to compete with, or even surpass, large proprietary models such as GPT-4. The fine-tuned 70B model achieves a 69.5% improvement in turn-level action accuracy and a 28% improvement in end-to-end goal completion rate on the out-of-domain Almita dataset.
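A hedged sketch of how such fine-tuning examples might be assembled from Pre-Act traces is shown below; the field names and prompt wording are illustrative assumptions, not the authors' published schema.

```python
import json

def to_sft_example(user_input: str, history: list, plan: str, next_action: str) -> dict:
    # The prompt mirrors the Pre-Act inference setup: user input plus the
    # steps and tool outputs executed so far in the conversation.
    prompt = (
        f"User input: {user_input}\n"
        f"Executed steps and tool outputs: {json.dumps(history)}\n"
        "Write a multi-step plan with detailed reasoning, then the next action."
    )
    # The target pairs the refined plan with the chosen action, so the
    # model learns to plan before it acts.
    completion = f"{plan}\nNext action: {next_action}"
    return {"prompt": prompt, "completion": completion}
```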

Technical Results and Findings

The technical execution of Pre-Act within LLM frameworks reveals substantial advances, particularly in agentic AI, where autonomous decisions demand complex multi-step reasoning. The experiments show that smaller, fine-tuned models can handle such reasoning tasks efficiently, operating with lower latency and reduced computational cost, both essential for practical deployment in real-time systems.

In particular, the fine-tuned Llama 3.1 70B delivered robust results, outperforming proprietary models and demonstrating that Pre-Act fine-tuning can lift smaller models past much larger ones. This suggests a path toward AI models that are both cost-effective and efficient, aligned with commercial settings where expense and real-time processing are decisive factors.

Implications and Future Directions

The implications of this research extend beyond simple model enhancement. By refining the cognitive processes governing agent actions, AI applications can become more intelligent and reliable, influencing sectors that depend on automated systems for decision-making and customer interaction. The Pre-Act paradigm not only boosts the reasoning capability of LLMs but also introduces a new benchmark for evaluating agent performance, emphasizing the importance of structured, multi-step approaches in future AI developments.

Looking forward, future work includes developing datasets rich in complex scenarios and exception handling, so that models can be trained and tested beyond conventional "happy path" interactions. Furthermore, exploring deterministic evaluation frameworks should yield more robust and accurate appraisals of agent performance, overcoming the limitations of human-mediated assessment and the potential biases of LLM-as-judge evaluation.

This paper marks a transition toward more nuanced AI systems capable of intricate reasoning anchored in real-world contexts, potentially fostering advances across dialog systems, AI-driven customer support, and more. The Pre-Act framework positions itself as a salient step toward such capabilities.

Authors (6)
  1. Mrinal Rawat (6 papers)
  2. Ambuje Gupta (2 papers)
  3. Rushil Goomer (1 paper)
  4. Alessandro Di Bari (2 papers)
  5. Neha Gupta (45 papers)
  6. Roberto Pieraccini (2 papers)