Leveraging Pre-trained Large Language Models with Refined Prompting for Online Task and Motion Planning (2504.21596v1)

Published 30 Apr 2025 in cs.RO and cs.AI

Abstract: With the rapid advancement of artificial intelligence, there is an increasing demand for intelligent robots capable of assisting humans in daily tasks and performing complex operations. Such robots not only require task planning capabilities but must also execute tasks with stability and robustness. In this paper, we present a closed-loop task planning and acting system, LLM-PAS, which is assisted by a pre-trained LLM. While LLM-PAS plans long-horizon tasks in a manner similar to traditional task and motion planners, it also emphasizes the execution phase of the task. By transferring part of the constraint-checking process from the planning phase to the execution phase, LLM-PAS enables exploration of the constraint space and delivers more accurate feedback on environmental anomalies during execution. The reasoning capabilities of the LLM allow it to handle anomalies that cannot be addressed by the robust executor. To further enhance the system's ability to assist the planner during replanning, we propose the First Look Prompting (FLP) method, which induces the LLM to generate effective PDDL goals. Through comparative prompting experiments and systematic experiments, we demonstrate the effectiveness and robustness of LLM-PAS in handling anomalous conditions during task execution.


Leveraging Pre-trained LLMs for Task and Motion Planning

This paper presents an approach to Task and Motion Planning (TAMP) that integrates pre-trained LLMs with refined prompting techniques. The resulting system, LLM-PAS, is a closed-loop task planning and execution framework designed to stay robust in dynamic environments, a central requirement for robotic autonomy.

The authors argue that LLMs should be used not only to plan long-horizon tasks but also to strengthen the execution phase: constraints are assessed in real time, so the system can adjust to conditions that violate the closed-world assumption commonly made by traditional planning methods. LLM-PAS introduces the First Look Prompting (FLP) method, which uses the LLM's reasoning capabilities to generate effective Planning Domain Definition Language (PDDL) goals during replanning. The aim is to make robots more practical in environments where real-time feedback and adaptation are vital.
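The plan, execute, and replan cycle described above can be illustrated with a minimal sketch. This is not the authors' implementation: `run_closed_loop`, the toy planner, the toy executor, and `toy_regoal` (a hard-coded stand-in for the LLM that proposes a new PDDL-style goal) are all hypothetical names invented for illustration, and the drawer scenario is made up.

```python
from typing import Callable, List, Set, Tuple

def run_closed_loop(
    goal: str,
    plan_fn: Callable[[str], List[str]],
    execute_fn: Callable[[str], bool],
    regoal_fn: Callable[[str, str], str],
    max_replans: int = 2,
) -> Tuple[bool, str, List[Tuple[str, bool]]]:
    """Plan for `goal`, execute step by step, and replan after a failed step."""
    log: List[Tuple[str, bool]] = []
    for _ in range(max_replans + 1):
        failed = None
        for action in plan_fn(goal):
            ok = execute_fn(action)
            log.append((action, ok))
            if not ok:
                failed = action
                break
        if failed is None:
            return True, goal, log
        # Assumption: an LLM would be queried here; we use a fixed rule instead.
        goal = regoal_fn(goal, failed)
    return False, goal, log

# Toy world: a box blocks the drawer, which the initial plan does not account for.
world: Set[str] = {"box_blocks_drawer"}

def toy_plan(goal: str) -> List[str]:
    steps = ["open drawer", "pick cup"]
    return ["move box"] + steps if "(clear drawer)" in goal else steps

def toy_execute(action: str) -> bool:
    if action == "move box":
        world.discard("box_blocks_drawer")
        return True
    if action == "open drawer":
        return "box_blocks_drawer" not in world
    return True

def toy_regoal(goal: str, failed_action: str) -> str:
    # Hypothetical stand-in for the LLM's goal generation during replanning.
    if failed_action == "open drawer":
        return "(and (clear drawer) (holding cup))"
    return goal

success, final_goal, log = run_closed_loop(
    "(holding cup)", toy_plan, toy_execute, toy_regoal
)
print(success, final_goal, log)
```

In this toy run the first attempt fails at "open drawer", the stand-in goal generator adds a subgoal to clear the drawer, and the second plan succeeds. The point is only the control flow: failures detected during execution feed back into goal generation rather than aborting the task.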

Key Methodological Contributions

Closed-loop Task System: LLM-PAS emphasizes both task planning and task execution. By shifting some constraint checks to the execution phase, it can adapt in real time to environmental anomalies, unlike traditional TAMP, which generally assumes idealized execution conditions.
First Look Prompting (FLP): This prompting method directs the LLM towards problem-specific reasoning and keeps irrelevant context out of the prompt, so that replanning is driven by PDDL goals the planner can actually use.
Behavior Tree Integration: The system translates action sequences into Conditional Subtrees of Behavior Trees (CSubBTs), allowing for robust execution and exploration of the constraint space. This design supports the self-adjustment capability of the executor, increasing the reliability of task completion in uncertain environments where predefined actions might fail.
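To make the behavior tree idea concrete, here is a minimal sketch of wrapping one planned action in a conditional subtree. The exact structure of the paper's CSubBTs is not given in this summary, so the layout below (skip the action if its effect already holds, otherwise check a precondition and run a recovery action if it fails) is an assumption based on common behavior tree practice. All class and function names are hypothetical, and the RUNNING status of real behavior trees is omitted for brevity.

```python
from typing import List, Set

State = Set[str]

class Node:
    def tick(self, state: State) -> bool:
        raise NotImplementedError

class Condition(Node):
    """Succeeds when a fact is present in the state."""
    def __init__(self, fact: str) -> None:
        self.fact = fact
    def tick(self, state: State) -> bool:
        return self.fact in state

class Action(Node):
    """Toy action: records its name and adds its effect to the state."""
    def __init__(self, name: str, effect: str, trace: List[str]) -> None:
        self.name, self.effect, self.trace = name, effect, trace
    def tick(self, state: State) -> bool:
        self.trace.append(self.name)
        state.add(self.effect)
        return True

class Sequence(Node):
    """Succeeds only if every child succeeds, stopping at the first failure."""
    def __init__(self, children: List[Node]) -> None:
        self.children = children
    def tick(self, state: State) -> bool:
        return all(child.tick(state) for child in self.children)

class Fallback(Node):
    """Succeeds as soon as one child succeeds."""
    def __init__(self, children: List[Node]) -> None:
        self.children = children
    def tick(self, state: State) -> bool:
        return any(child.tick(state) for child in self.children)

def conditional_subtree(effect: str, precondition: str,
                        recovery: Node, action: Node) -> Node:
    # Assumed layout, not the paper's definition of a CSubBT.
    return Fallback([
        Condition(effect),
        Sequence([Fallback([Condition(precondition), recovery]), action]),
    ])

trace: List[str] = []
state: State = set()  # the gripper starts closed, so the precondition is unmet
pick = conditional_subtree(
    effect="holding_cup",
    precondition="gripper_open",
    recovery=Action("open gripper", "gripper_open", trace),
    action=Action("pick cup", "holding_cup", trace),
)
first = pick.tick(state)   # runs the recovery, then the action
second = pick.tick(state)  # effect already holds, so nothing is re-executed
print(first, second, trace)
```

The design choice illustrated here is that the executor handles small deviations locally (an unmet precondition triggers a recovery action) and only anomalies it cannot resolve would need to be escalated to the LLM-assisted replanner.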

Experimental Validation

The authors validate the LLM-PAS system through systematic experiments in both simulated and real-world settings. The results indicate that the system can successfully handle various types of anomalies during task execution, particularly object loss, action blocking, and unexpected state changes. Comparative tests with other LLM-based reactive systems, such as InnerMonologue and ProgPrompt, highlight the superior generalization capabilities and planning success of the proposed FLP method. The quantitative measurements report improved Success Rates (SR) and reduced Average Success Path Lengths (ASPL), demonstrating the efficiency of LLM-PAS in traditional task domains.
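As a rough illustration of the two reported metrics, the sketch below computes them from a list of episodes. The summary does not state the paper's exact definitions, so this assumes SR is the fraction of successful episodes and ASPL is the mean number of executed steps over successful episodes only. The function names and the episode values are made up for illustration and are not results from the paper.

```python
from typing import List, Tuple

# Each episode is (succeeded, number_of_executed_steps).
Episode = Tuple[bool, int]

def success_rate(episodes: List[Episode]) -> float:
    """Fraction of episodes that succeeded (assumed definition of SR)."""
    if not episodes:
        raise ValueError("no episodes")
    return sum(1 for ok, _ in episodes if ok) / len(episodes)

def average_success_path_length(episodes: List[Episode]) -> float:
    """Mean step count over successful episodes (assumed definition of ASPL)."""
    lengths = [steps for ok, steps in episodes if ok]
    if not lengths:
        raise ValueError("no successful episodes")
    return sum(lengths) / len(lengths)

# Toy values, not data from the paper.
episodes: List[Episode] = [(True, 6), (True, 8), (False, 12), (True, 7)]
sr = success_rate(episodes)
aspl = average_success_path_length(episodes)
print(sr, aspl)  # 0.75 7.0
```

Under these assumed definitions, a higher SR together with a lower ASPL means the system completes more tasks while taking fewer steps on the tasks it completes.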

Implications and Future Directions

The implications of this work extend beyond immediate robotics applications to broader AI planning challenges. By delegating logical reasoning to LLMs, the system shows one way to make robots more adaptable, which could broaden the role of TAMP in real-world scenarios. However, the current integration is provisional, so further refinement and optimization of cross-platform interactions are needed. Multimodal LLMs as robot planning agents are a natural next direction and could strengthen both the planning and execution phases in dynamic environments.

In conclusion, the paper offers a framework for robotic systems that must execute complex operations robustly. Given the growing capabilities of LLMs, future iterations may use multimodal models that integrate perceptual data into decision-making.


Authors (5)

Huihui Guo
Huilong Pi
Yunchuan Qin
Zhuo Tang
Kenli Li
