Leveraging Pre-trained LLMs for Model-based Task Planning with PDDL
The integration of LLMs into AI systems opens new frontiers in task planning by leveraging their extensive pre-trained knowledge. In their paper, Lin Guan, Karthik Valmeekam, Sarath Sreedharan, and Subbarao Kambhampati introduce a framework in which LLMs such as GPT-4 generate explicit world models codified in the Planning Domain Definition Language (PDDL), enabling efficient task planning with domain-independent classical planners. The approach capitalizes on the knowledge-extraction strengths of LLMs while addressing the well-documented correctness and executability problems of plans generated directly by LLMs.
Framework Overview
The authors propose a two-step framework: first, an LLM constructs PDDL models from natural-language task descriptions; second, those models are handed to domain-independent classical planners to devise feasible plans. This contrasts with approaches that rely on LLMs alone for planning, which often yield suboptimal or outright invalid plans because they overlook physical constraints and long-horizon dependencies. A sketch of the two-step pipeline appears below.
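The following minimal sketch illustrates the two-step idea under stated assumptions: the prompt text, file names, and the choice of Fast Downward as the planner are illustrative stand-ins, not the authors' exact setup.

```python
"""Sketch of the two-step pipeline: an LLM drafts a PDDL domain, then an
off-the-shelf classical planner searches for a plan. Prompt wording,
file names, and the planner choice are illustrative assumptions."""
import subprocess
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def draft_pddl_domain(task_description: str) -> str:
    """Step 1: ask the LLM to translate a natural-language task
    description into a PDDL domain definition."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Translate the task description into a complete "
                        "PDDL domain with typed parameters, preconditions, "
                        "and effects. Output only PDDL."},
            {"role": "user", "content": task_description},
        ],
    )
    return response.choices[0].message.content

def solve_with_classical_planner(domain_file: str, problem_file: str) -> None:
    """Step 2: hand the symbolic model to a domain-independent planner
    (here Fast Downward with an admissible heuristic; any sound planner works)."""
    subprocess.run(
        ["./fast-downward.py", domain_file, problem_file,
         "--search", "astar(lmcut())"],
        check=True,
    )  # on success, Fast Downward writes the plan to ./sas_plan

domain = draft_pddl_domain("A robot must put the groceries into the fridge.")
with open("domain.pddl", "w") as f:
    f.write(domain)
solve_with_classical_planner("domain.pddl", "problem.pddl")
```

Because the planner operates only on the symbolic model, any plan it returns is sound with respect to that model; the quality of the overall system therefore hinges on getting the PDDL right, which is where the refinement step described next comes in.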
Central to the methodology is the extraction of symbolic action models in PDDL. The LLM is prompted to generate PDDL action schemas from descriptive inputs, and these schemas are iteratively refined using natural-language feedback from both automated domain validators and human users (see the sketch after this paragraph). Because corrective effort is concentrated on validating the model up front rather than on critiquing every generated plan, human involvement is needed roughly once per domain instead of once per planning episode.
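To make the correction loop concrete, here is a hedged sketch. The `syntax_check` helper, the error-message format, and the example `pick-up` schema are hypothetical illustrations, not taken from the paper's domains or code.

```python
"""Sketch of the iterative refinement loop: the LLM proposes an action
schema, a validator (or a human) returns natural-language feedback, and
that feedback is appended to the prompt for the next attempt."""

# An action schema of the kind the LLM is prompted to produce.
# This example is illustrative, not drawn from the paper's domains.
EXAMPLE_SCHEMA = """
(:action pick-up
  :parameters (?obj - item ?loc - location)
  :precondition (and (at ?obj ?loc) (robot-at ?loc) (hand-empty))
  :effect (and (holding ?obj) (not (at ?obj ?loc)) (not (hand-empty))))
"""

def syntax_check(pddl_text: str) -> str | None:
    """Hypothetical wrapper around a PDDL parser/validator; returns a
    human-readable error message, or None if the model parses cleanly."""
    ...

def refine(llm, base_prompt: str, max_rounds: int = 3) -> str:
    """Regenerate the schema until the validator is satisfied, feeding
    each error back to the LLM as corrective natural-language feedback."""
    prompt = base_prompt
    schema = llm(prompt)
    for _ in range(max_rounds):
        error = syntax_check(schema)
        if error is None:
            return schema  # validated model, ready for the planner
        prompt += f"\nThe previous model had an error: {error}. Please fix it."
        schema = llm(prompt)
    raise RuntimeError("model still invalid after feedback rounds")
```

The design choice worth noting is that feedback is expressed in natural language rather than as patches to the PDDL itself, which lets non-expert users and automated tools share the same correction channel.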
Empirical Evaluation
The efficacy of the framework was validated empirically across diverse domains, including two IPC benchmark domains and a complex household domain whose constraints resemble those of real-world robotic applications. Notably, GPT-4 successfully constructed high-quality PDDL models comprising over 40 actions, clearly surpassing the noisier, more error-prone outputs of GPT-3.5.
Quantitative results underscore the robustness of the approach: in the household domain, GPT-4 produced models with significantly fewer errors than GPT-3.5, enabling the reliable solution of 48 planning tasks with minimal manual correction. Remaining errors were resolved largely by feeding human-readable feedback back to the model after syntax checks had been performed automatically with tools such as the VAL plan validator.
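For reference, VAL is a command-line tool; the sketch below wraps it in a subprocess call. The binary name, flags, and output phrasing vary by build (the KCL-Planning/VAL distribution is assumed here), so treat the exact invocation and the checked output string as assumptions.

```python
"""Sketch of automated plan checking with VAL. Assumes the `Validate`
binary from the KCL-Planning/VAL distribution is on PATH; the exact
binary name, flags, and output wording vary by build."""
import subprocess

def validate_plan(domain: str, problem: str, plan: str) -> bool:
    """Run VAL on a (domain, problem, plan) triple and report whether
    the plan executes successfully under the model's semantics."""
    result = subprocess.run(
        ["Validate", "-v", domain, problem, plan],
        capture_output=True, text=True,
    )
    # VAL prints an execution trace plus a verdict; a zero exit code and
    # a "Plan valid" verdict indicate success in the assumed build.
    return result.returncode == 0 and "Plan valid" in result.stdout

if validate_plan("domain.pddl", "problem.pddl", "sas_plan"):
    print("Plan verified against the PDDL model.")
else:
    print("Validation failed; feed VAL's error trace back for repair.")
```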
Implications and Future Directions
Practically, embedding LLM-derived world models in planning systems exploits the knowledge these models acquire during pre-training to automate the modeling process, significantly reducing dependence on domain experts throughout the planning lifecycle. This points toward hybrid systems in which LLM-generated symbolic models interface seamlessly with domain-independent planners, promising improved reliability and efficiency.
Theoretically, the paper suggests that while LLMs are proficient at generating domain models, the combinatorial complexity of planning still demands classical planners for solution generation. LLMs are therefore best positioned as facilitators of domain-model extraction and validation rather than as standalone planners.
Future work might scale the framework to more complex and partially observable environments, extending it to cope with the incomplete information characteristic of real-world scenarios. Additionally, enabling LLMs to resolve logical inconsistencies in generated models autonomously, and to incorporate feedback more efficiently, remains a pivotal research challenge.
Overall, the research demonstrates a promising direction for augmenting AI planning with LLMs, providing new insights and tools while charting a course toward more transparent, reliable, and efficient autonomous systems.