
Learning Planning Abstractions from Language (2405.03864v1)

Published 6 May 2024 in cs.RO and cs.AI

Abstract: This paper presents a framework for learning state and action abstractions in sequential decision-making domains. Our framework, planning abstraction from language (PARL), utilizes language-annotated demonstrations to automatically discover a symbolic and abstract action space and induce a latent state abstraction based on it. PARL consists of three stages: 1) recovering object-level and action concepts, 2) learning state abstractions, abstract action feasibility, and transition models, and 3) applying low-level policies for abstract actions. During inference, given the task description, PARL first makes abstract action plans using the latent transition and feasibility functions, then refines the high-level plan using low-level policies. PARL generalizes across scenarios involving novel object instances and environments, unseen concept compositions, and tasks that require longer planning horizons than settings it is trained on.

Authors (5)
  1. Weiyu Liu
  2. Geng Chen
  3. Joy Hsu
  4. Jiayuan Mao
  5. Jiajun Wu

Summary

Exploring Abstractions in AI Planning through Language: A Look at PARL

Introduction

Abstraction has long been central to efficient planning and learning in AI, particularly in robotics and related fields. Typically, it involves simplifying complex environments into more manageable state and action representations. Leveraging these abstractions lets an agent reason about and interact with its environment in a computationally frugal way.

However, previous methodologies have often relied on manually defined abstract "symbols", which is labor-intensive and restricts the flexibility of the system. Recent work instead aims to learn these abstractions directly from data, and notably, from natural language.

This blog post explores Planning Abstraction from Language (PARL), the framework detailed in the paper above. PARL automatically discovers an abstract action space from language-annotated demonstrations, induces a latent state abstraction based on it, and uses both to plan and act within a given environment.

Breaking Down the PARL Framework

PARL's Core Stages:

  1. Symbol Discovery: PARL first analyzes the language descriptions attached to demonstrations to extract "action" and "object" concepts, the symbolic building blocks of the tasks to be performed.
  2. Abstract Model Training: Once the symbols are isolated, PARL learns how abstract actions transition between latent states, whether a given abstract action is feasible in a given state, and how each abstract action maps onto low-level controllable actions in the environment (such as robot motions).
  3. Plan Execution: At inference time, PARL proposes sequences of abstract actions from real-time observations, predicts their outcomes with the learned models, and refines the plan with low-level policies to fulfill tasks described in natural language.

Through these stages, PARL promotes a nuanced understanding and interaction with varied environments based purely on symbolic representations and abstracted instructions.
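The inference procedure above can be sketched as a simple search over abstract action sequences, pruned by the learned feasibility model. This is a minimal illustration, not the paper's implementation: the names `transition`, `feasible`, and `satisfies` stand in for PARL's learned latent transition, feasibility, and goal models, and the toy breadth-first search omits any deduplication of visited states.

```python
def plan_abstract(init_state, goal, actions, transition, feasible, satisfies,
                  max_depth=4):
    """Breadth-first search over abstract action sequences.

    Hypothetical stand-ins for PARL's learned models:
      transition(state, action) -> predicted next latent state
      feasible(state, action)   -> bool, can this abstract action succeed here?
      satisfies(state, goal)    -> bool, does this latent state meet the goal?
    """
    frontier = [(init_state, [])]
    for _ in range(max_depth):
        next_frontier = []
        for state, plan in frontier:
            if satisfies(state, goal):
                return plan
            for a in actions:
                if feasible(state, a):  # prune infeasible branches early
                    next_frontier.append((transition(state, a), plan + [a]))
        frontier = next_frontier
    return None  # no plan found within the horizon
```

In PARL the states here would be latent vectors produced by the learned state abstraction, and the returned abstract plan would then be handed to low-level policies for refinement and execution.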

Practical Applications and Implications

The capabilities of PARL extend into areas where robust planning is essential:

  • Robotics: Especially in scenarios where discrete tasks must be defined and executed precisely in dynamic environments, such as household robotics or manufacturing lines.
  • Gaming and Simulations: Where characters or agents need to navigate through complex set-ups or storylines by understanding and following abstracted commands.

Practically, what makes PARL especially powerful is its ability to generalize to new, unseen scenarios: novel object instances, unseen concept compositions, and tasks with longer planning horizons than those covered in its training data. This matters most in settings where variability is frequent or unpredictable.

Theoretical Contributions and Future Prospects

The underlying power of PARL lies in automating the extraction of high-level abstractions from descriptive language, which both eases training and improves adaptability across tasks and environments. It stands out by enabling a form of "planning by abstraction": searching over abstract actions with the learned transition and feasibility models supports faster and more flexible decision-making.
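As a concrete illustration of the feasibility component mentioned above, here is a minimal sketch of a feasibility scorer: a logistic model over a concatenated latent-state and action-embedding vector. The class name, shapes, and linear form are assumptions for illustration only; the paper learns neural models end-to-end.

```python
import numpy as np

class FeasibilityModel:
    """Illustrative feasibility scorer (not the paper's architecture):
    a logistic model over [latent_state ; action_embedding]."""

    def __init__(self, state_dim, action_dim, rng=None):
        rng = rng or np.random.default_rng(0)
        # Randomly initialized weights; in practice these would be trained
        # on language-annotated demonstrations.
        self.w = rng.normal(size=state_dim + action_dim)
        self.b = 0.0

    def score(self, latent_state, action_embedding):
        x = np.concatenate([latent_state, action_embedding])
        return 1.0 / (1.0 + np.exp(-(self.w @ x + self.b)))  # sigmoid in (0, 1)

    def feasible(self, latent_state, action_embedding, threshold=0.5):
        # Threshold the score to get a binary feasibility decision,
        # usable for pruning during abstract planning.
        return self.score(latent_state, action_embedding) >= threshold
```

During planning, such a scorer would be queried for each candidate abstract action to discard branches that the low-level policies could not execute.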

As future directions, enhancements could focus on improving the initialization and segmentation of actions in its input, possibly leveraging unsupervised learning to reduce reliance on curated data. Moreover, integrating stronger pre-trained models for object recognition could broaden its applicability to more diverse scenarios and further strengthen its generalization.

In conclusion, PARL represents a significant step toward grounding language understanding in practical planning and decision-making for artificial intelligence. Its ability to decompose and execute language-based instructions not only streamlines the planning process but also opens fertile ground for future work on training and operating autonomous agents.