InterPreT: Interactive Predicate Learning from Language Feedback for Generalizable Task Planning (2405.19758v1)

Published 30 May 2024 in cs.RO

Abstract: Learning abstract state representations and knowledge is crucial for long-horizon robot planning. We present InterPreT, an LLM-powered framework for robots to learn symbolic predicates from language feedback of human non-experts during embodied interaction. The learned predicates provide relational abstractions of the environment state, facilitating the learning of symbolic operators that capture action preconditions and effects. By compiling the learned predicates and operators into a PDDL domain on-the-fly, InterPreT allows effective planning toward arbitrary in-domain goals using a PDDL planner. In both simulated and real-world robot manipulation domains, we demonstrate that InterPreT reliably uncovers the key predicates and operators governing the environment dynamics. Although learned from simple training tasks, these predicates and operators exhibit strong generalization to novel tasks with significantly higher complexity. In the most challenging generalization setting, InterPreT attains success rates of 73% in simulation and 40% in the real world, substantially outperforming baseline methods.

Essay on "InterPreT: Interactive Predicate Learning from Language Feedback for Generalizable Task Planning"

The paper introduces InterPreT, a framework that leverages LLMs to enable robots to learn symbolic predicates through embodied interaction with human non-experts. It addresses a fundamental problem in robotics: long-horizon task planning over abstract state representations. Using interactive language feedback, InterPreT learns symbolic predicates that map intricate environment states onto high-level relational abstractions, which in turn support the learning of symbolic operators that capture action preconditions and effects.

Core Contributions and Methodology

The pivotal contribution of this paper is the use of LLMs, specifically GPT-4, to derive the predicates and operators that form the backbone of symbolic task planning. The learned predicates serve as a bridge between concrete state observations and symbolic forms, enabling efficient planning with Planning Domain Definition Language (PDDL) planners.
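As a concrete illustration of how a learned predicate can bridge continuous observations and the symbolic state a PDDL planner consumes, consider the following minimal sketch. The state format, object names, and distance thresholds here are illustrative assumptions, not details taken from the paper.

```python
import math

# Hypothetical learned predicate: names, thresholds, and the state format
# are illustrative assumptions, not the paper's actual output.
ON_TOP_TOLERANCE = 0.03  # meters; assumed surface-contact threshold

def is_on(state, obj_a, obj_b):
    """Return True if obj_a rests on top of obj_b.

    `state` maps object names to (x, y, z, height), where (x, y, z)
    is the object's centroid position.
    """
    xa, ya, za, ha = state[obj_a]
    xb, yb, zb, hb = state[obj_b]
    horizontally_aligned = math.hypot(xa - xb, ya - yb) < 0.05
    vertically_stacked = abs((za - ha / 2) - (zb + hb / 2)) < ON_TOP_TOLERANCE
    return horizontally_aligned and vertically_stacked

def ground_state(state):
    """Evaluate the predicate over all object pairs to produce the
    symbolic state, e.g. {("on", "block_a", "block_b")}."""
    return {("on", a, b)
            for a in state for b in state
            if a != b and is_on(state, a, b)}
```

Grounding every learned predicate over the current observation in this way yields the set of true symbolic facts, which is exactly the initial-state description a PDDL planner expects.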

InterPreT distinguishes itself by synthesizing language feedback into predicate functions using Python, exploiting the semantic understanding embedded within LLMs. This is operationalized through three main modules: the Reasoner, which extracts task-relevant information and identifies novel predicates; the Coder, which constructs the predicate functions; and the Corrector, which iteratively refines these functions. Through these modules, InterPreT ensures that the learned predicates align with the underlying semantic structures required for robust task planning.
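The Coder/Corrector refinement described above can be sketched as a loop that accepts a candidate predicate function only once it agrees with all human feedback. In this toy version the LLM calls are stubbed out; the function names, the relational state encoding, and the feedback format are illustrative assumptions, not the paper's actual API.

```python
# Minimal sketch of the Coder/Corrector refinement loop, with LLM calls
# stubbed out. All names and data formats are illustrative assumptions.

def is_clear_v1(state, obj):
    # First Coder draft (intentionally buggy): tests whether `obj` sits on
    # something, instead of whether something sits on `obj`.
    return not any(rel[0] == "on" and rel[1] == obj for rel in state)

def is_clear_v2(state, obj):
    # Refined draft: `obj` is clear when no other object is on top of it.
    return not any(rel[0] == "on" and rel[2] == obj for rel in state)

def corrector(predicate_fn, labeled_examples):
    """Return the examples where the candidate disagrees with human labels;
    a real Corrector would feed these disagreements back to the LLM to
    rewrite the predicate function."""
    return [(state, args, label)
            for state, args, label in labeled_examples
            if predicate_fn(state, *args) != label]

def learn_predicate(candidates, labeled_examples):
    """Accept the first candidate consistent with all human feedback,
    mimicking the iterative refinement loop."""
    for fn in candidates:
        if not corrector(fn, labeled_examples):
            return fn
    return None
```

In a toy blocks state containing the single fact `("on", "block_a", "block_b")`, the buggy draft mislabels `block_a` as not clear; the corrector surfaces that disagreement, and the refined draft is the one accepted.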

The framework is validated through experiments in both simulated environments and real-world robotics tasks, demonstrating that the learned predicates generalize to novel tasks of significantly higher complexity. In the most challenging generalization setting, InterPreT substantially outperforms baseline methods, achieving success rates of 73% in simulation and 40% in the real world. These experiments underline the method's capacity to discover task-relevant predicates from minimal training data, showcasing its robustness and efficacy.

Results and Implications

The empirical results illustrate a significant improvement in the efficiency and effectiveness of symbolic task planning compared to existing LLM-based planners. The learned predicates and operators support robust planning and generalization, an advance over traditional symbolic planners, which typically rely on manually designed predicates and operators. Furthermore, learning from real-world language feedback makes training faster and more effective, closely mirroring how humans learn.

These findings have important ramifications for future research in AI and robotics. By more deeply integrating human-centric interaction mechanisms with robotic planning, InterPreT sets the stage for adaptive systems capable of seamless task execution in dynamic and uncertain environments. The paper suggests a starting point for broader frameworks that combine robust sensory processing with advanced symbolic reasoning, in pursuit of autonomous systems with human-like understanding and planning capabilities.

Conclusion and Future Directions

InterPreT represents a substantial step forward in interactive and interpretable AI systems. By successfully leveraging LLMs for symbolic predicate learning, it bridges the gap between complex continuous observations and the symbolic reasoning required for high-level task execution. Although the framework currently targets deterministic planning domains, future iterations could extend to probabilistic task settings and Task and Motion Planning (TAMP), making robots more adaptable to real-world uncertainty and variation. Moreover, the research highlights the potential to bootstrap learning in new domains from previously acquired knowledge, paving the way for progressively more intelligent and autonomous AI systems.

Authors (5)
  1. Muzhi Han (8 papers)
  2. Yifeng Zhu (21 papers)
  3. Song-Chun Zhu (216 papers)
  4. Ying Nian Wu (138 papers)
  5. Yuke Zhu (134 papers)
Citations (13)