
Embodied Active Learning of Relational State Abstractions for Bilevel Planning (2303.04912v2)

Published 8 Mar 2023 in cs.RO, cs.AI, and cs.LG

Abstract: State abstraction is an effective technique for planning in robotics environments with continuous states and actions, long task horizons, and sparse feedback. In object-oriented environments, predicates are a particularly useful form of state abstraction because of their compatibility with symbolic planners and their capacity for relational generalization. However, to plan with predicates, the agent must be able to interpret them in continuous environment states (i.e., ground the symbols). Manually programming predicate interpretations can be difficult, so we would instead like to learn them from data. We propose an embodied active learning paradigm where the agent learns predicate interpretations through online interaction with an expert. For example, after taking actions in a block stacking environment, the agent may ask the expert: "Is On(block1, block2) true?" From this experience, the agent learns to plan: it learns neural predicate interpretations, symbolic planning operators, and neural samplers that can be used for bilevel planning. During exploration, the agent plans to learn: it uses its current models to select actions towards generating informative expert queries. We learn predicate interpretations as ensembles of neural networks and use their entropy to measure the informativeness of potential queries. We evaluate this approach in three robotic environments and find that it consistently outperforms six baselines while exhibiting sample efficiency in two key metrics: number of environment interactions, and number of queries to the expert. Code: https://tinyurl.com/active-predicates
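The query-selection idea described in the abstract (predicate interpretations learned as an ensemble of neural networks, with the ensemble's entropy measuring how informative a candidate expert query would be) can be illustrated with a minimal sketch. This is not the authors' implementation; the function names, network sizes, and feature encoding below are hypothetical assumptions for illustration only.

```python
# Hypothetical sketch (not the paper's code): score candidate predicate
# queries by the entropy of an ensemble of neural predicate classifiers.
import torch
import torch.nn as nn


def make_classifier(feature_dim: int) -> nn.Module:
    """One ensemble member: a small MLP mapping object features -> P(predicate holds)."""
    return nn.Sequential(
        nn.Linear(feature_dim, 32), nn.ReLU(),
        nn.Linear(32, 1), nn.Sigmoid(),
    )


def query_entropy(ensemble, features: torch.Tensor) -> float:
    """Binary entropy of the mean ensemble prediction for one candidate ground atom.

    High entropy means the ensemble is uncertain, so asking the expert about
    this atom (e.g. "Is On(block1, block2) true?") would be informative.
    """
    with torch.no_grad():
        p = torch.stack([m(features).squeeze(-1) for m in ensemble]).mean()
    p = p.clamp(1e-6, 1 - 1e-6)
    return float(-(p * p.log() + (1 - p) * (1 - p).log()))


if __name__ == "__main__":
    feature_dim = 8  # assumed size of the concatenated object-feature vector
    ensemble = [make_classifier(feature_dim) for _ in range(5)]
    # Candidate queries: feature vectors of ground atoms reachable from the current state.
    candidates = {f"atom_{i}": torch.randn(feature_dim) for i in range(3)}
    best = max(candidates, key=lambda k: query_entropy(ensemble, candidates[k]))
    print("most informative query:", best)
```

In the paper's paradigm this score would guide exploration: the agent plans toward states where high-entropy ground atoms can be queried, then uses the expert's answers to retrain the ensemble.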

Authors (2)
  1. Amber Li (2 papers)
  2. Tom Silver (31 papers)
Citations (5)
