Learning adaptive planning representations with natural language guidance (2312.08566v1)

Published 13 Dec 2023 in cs.AI, cs.CL, and cs.RO

Abstract: Effective planning in the real world requires not only world knowledge, but the ability to leverage that knowledge to build the right representation of the task at hand. Decades of hierarchical planning techniques have used domain-specific temporal action abstractions to support efficient and accurate planning, almost always relying on human priors and domain knowledge to decompose hard tasks into smaller subproblems appropriate for a goal or set of goals. This paper describes Ada (Action Domain Acquisition), a framework for automatically constructing task-specific planning representations using task-general background knowledge from language models (LMs). Starting with a general-purpose hierarchical planner and a low-level goal-conditioned policy, Ada interactively learns a library of planner-compatible high-level action abstractions and low-level controllers adapted to a particular domain of planning tasks. On two language-guided interactive planning benchmarks (Mini Minecraft and ALFRED Household Tasks), Ada strongly outperforms other approaches that use LMs for sequential decision-making, offering more accurate plans and better generalization to complex tasks.


Summary

  • The paper introduces the Ada framework that automatically constructs symbolic planning operators and low-level controllers using language models for task-specific planning.
  • It demonstrates bi-level planning where high-level plans built as PDDL operators are refined by learned low-level control policies in interactive environments.
  • Experimental results on Mini Minecraft and ALFRED benchmarks highlight significant improvements in planning efficiency and operator generalization.

This paper presents the Action Domain Acquisition (Ada) framework, which addresses the challenge of constructing effective task-specific planning representations using task-general background knowledge from language models (LMs). The primary goal is to automatically construct hierarchical task domains that support efficient and accurate planning without relying extensively on human-engineered priors. Ada builds these domain abstractions by interactively learning a library of planner-compatible high-level action definitions, verified through execution in the environment.
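The interactive library-learning loop described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the helper names (`propose_with_lm`, `try_to_execute`) and the score-and-filter threshold are assumptions, and real operators carry typed parameters rather than flat predicate sets.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Operator:
    """A PDDL-style symbolic operator: name, preconditions, effects."""
    name: str
    preconditions: frozenset  # predicates that must hold before execution
    effects: frozenset        # predicates made true after execution

def learn_operator_library(tasks, propose_with_lm, try_to_execute, min_successes=2):
    """Hypothetical sketch of Ada's outer loop: an LM proposes candidate
    operators for each task, and a candidate enters the library only after
    plans that use it are verified by execution enough times."""
    library, tallies = set(), {}
    for task in tasks:
        for op in propose_with_lm(task, library):   # LM-proposed candidates
            if try_to_execute(op, task):            # grounded verification
                tallies[op] = tallies.get(op, 0) + 1
                if tallies[op] >= min_successes:    # keep reliable operators
                    library.add(op)
    return library
```

The key design point the sketch captures is that the LM only *proposes* abstractions; membership in the library is earned through grounded execution, which filters out hallucinated or inapplicable operators.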

Key Contributions:

  • Adaptive Representation Learning: Ada automatically constructs symbolic planning operators that specify preconditions and effects for actions, facilitating traditional symbolic planning. Simultaneously, it learns local controllers that can execute these operators via low-level actions in interactive environments.
  • Hierarchical Action Spaces: Ada leverages LMs to extract potential high-level actions in the format of Planning Domain Definition Language (PDDL) operators and uses these to build a compositional library that can adapt to varied planning tasks.
  • Bi-Level Planning: Once Ada has defined a library of actions, it employs a hierarchical approach to planning. High-level plans are constructed using symbolic operators, and low-level control policies are learned or refined to achieve subgoals imposed by these high-level plans.
  • Task Environment and Language Benchmarking: Evaluations on the language-guided benchmarks Mini Minecraft and ALFRED demonstrate Ada’s effectiveness in adapting general planning strategies to specific goals expressed in natural language, with substantial improvements over existing language-model-driven planning methods.
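The bi-level scheme in the contributions above can be sketched as follows. This is an illustrative simplification under stated assumptions: states and subgoals are modeled as sets of ground predicates, and the helpers `symbolic_plan` and `low_level_policy` are hypothetical stand-ins for the symbolic planner and the learned goal-conditioned controller.

```python
def bi_level_plan(goal, state, symbolic_plan, low_level_policy, max_steps=50):
    """Hypothetical sketch of bi-level planning: the symbolic planner
    produces a sequence of high-level operators, and each operator's
    effects become a subgoal that the low-level policy must reach."""
    for op in symbolic_plan(goal, state):             # high-level PDDL-style plan
        subgoal = op.effects                          # operator effects as subgoal
        for _ in range(max_steps):
            if subgoal <= state:                      # subgoal satisfied (subset)
                break
            state = low_level_policy(state, subgoal)  # take one low-level action
        else:
            return None                               # subgoal unreachable: replan
    return state
```

A failure at the low level (the `else` branch) is exactly the signal used to refine or discard an operator, tying plan execution back into the library-learning process.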

Experimental Results:

  • Mini Minecraft and ALFRED Benchmarks: Ada outperforms baselines such as low-level-planning-only approaches (e.g., direct goal translation), subgoal sequence prediction, and code-based policy prediction across both simple and complex planning tasks.
  • Ada achieves 100% success on mining and crafting tasks in Mini Minecraft, including compositional tasks requiring up to 26 steps. On ALFRED, Ada reaches a 79% task completion rate, a significant advance over baselines that struggled to exceed 21%.
  • Operator Generalization and Planning Efficiency: The framework shows that LLM-driven operators enable the generalization of learned actions to solve previously unseen tasks, effectively handling ambiguity and underspecification in human language instructions.

Challenges and Future Directions:

  • The framework currently relies on a predefined set of high-level predicates for initial state representation, limiting adaptability in the absence of pre-specified domain knowledge.
  • Expanding the role of multimodal LLMs may address challenges related to perceptual input integration and improve the handling of geometric or fine-grained motor tasks.

In conclusion, Ada leverages LMs to transition from generalized language knowledge to functional domain-specific planning representations, significantly advancing the capabilities of AI systems in constructing scalable and adaptive planning strategies from linguistic specifications. Importantly, the approach highlights the potential synergy between structured symbolic planning frameworks and the rich background knowledge encapsulated within modern LMs.