Generalized Planning in PDDL Domains with Pretrained Large Language Models (2305.11014v2)
Abstract: Recent work has considered whether LLMs can function as planners: given a task, generate a plan. We investigate whether LLMs can serve as generalized planners: given a domain and training tasks, generate a program that efficiently produces plans for other tasks in the domain. In particular, we consider PDDL domains and use GPT-4 to synthesize Python programs. We also consider (1) Chain-of-Thought (CoT) summarization, where the LLM is prompted to summarize the domain and propose a strategy in words before synthesizing the program; and (2) automated debugging, where the program is validated with respect to the training tasks, and in case of errors, the LLM is re-prompted with four types of feedback. We evaluate this approach in seven PDDL domains and compare it to four ablations and four baselines. Overall, we find that GPT-4 is a surprisingly powerful generalized planner. We also conclude that automated debugging is very important, that CoT summarization has non-uniform impact, that GPT-4 is far superior to GPT-3.5, and that just two training tasks are often sufficient for strong generalization.
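The automated debugging step described in the abstract can be sketched as a validate-and-re-prompt loop. This is a minimal illustration, not the authors' implementation. The names `query_llm`, `check_plan`, `get_plan`, and `max_attempts` are hypothetical, as are the prompt wording and the task representation. The abstract does not enumerate its four feedback types, so this sketch distinguishes only two cases: a runtime exception, and a plan rejected by a caller-supplied checker.

```python
from typing import Callable, List, Optional

# Hypothetical representations: a task is a dict, a plan is a list of action strings.
Task = dict
Plan = List[str]


def run_candidate(program_src: str, task: Task) -> Plan:
    """Execute synthesized source and call its `get_plan(task)` function.

    The function name `get_plan` is an assumption of this sketch.
    """
    namespace: dict = {}
    # The sketch trusts the source; real use would need sandboxing and a timeout.
    exec(program_src, namespace)
    return namespace["get_plan"](task)


def synthesize_with_debugging(
    query_llm: Callable[[str], str],
    check_plan: Callable[[Task, Plan], Optional[str]],
    domain_description: str,
    training_tasks: List[Task],
    max_attempts: int = 4,  # arbitrary budget chosen for illustration
) -> Optional[str]:
    """Return program source that passes every training task, or None."""
    prompt = f"Domain:\n{domain_description}\nWrite get_plan(task) in Python."
    for _ in range(max_attempts):
        program_src = query_llm(prompt)
        feedback: Optional[str] = None
        for task in training_tasks:
            try:
                plan = run_candidate(program_src, task)
            except Exception as exc:
                # Feedback case 1: the program itself crashed.
                feedback = f"The program raised {type(exc).__name__}: {exc}"
                break
            # Feedback case 2: the plan was produced but is not valid.
            feedback = check_plan(task, plan)
            if feedback is not None:
                break
        if feedback is None:
            return program_src  # validated on all training tasks
        # Re-prompt with the failed attempt and the first error encountered.
        prompt = (
            f"{prompt}\n\nPrevious attempt:\n{program_src}\n"
            f"Feedback: {feedback}\nFix the program."
        )
    return None
```

Here `query_llm` stands in for any function that maps a prompt string to program text, and `check_plan` returns `None` for a valid plan or an error message otherwise. In a PDDL setting the checker would be backed by a plan validator, and the returned program would then be run on held-out tasks to measure generalization.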
- Tom Silver
- Soham Dan
- Kavitha Srinivas
- Joshua B. Tenenbaum
- Leslie Pack Kaelbling
- Michael Katz