Agentic Skill Discovery (2405.15019v2)
Abstract: Language-conditioned robotic skills make it possible to apply the high-level reasoning of LLMs to low-level robotic control. A remaining challenge is to acquire a diverse set of fundamental skills. Existing approaches either manually decompose a complex task into atomic robotic actions in a top-down fashion, or bootstrap as many combinations as possible in a bottom-up fashion to cover a wider range of task possibilities. These decompositions or combinations, however, require an initial skill library. For example, a "grasping" capability can never emerge from a skill library containing only diverse "pushing" skills. Existing skill discovery techniques based on reinforcement learning acquire skills through exhaustive exploration but often yield non-meaningful behaviors. In this study, we introduce a novel framework for skill discovery that is entirely driven by LLMs. The framework begins with an LLM generating task proposals based on the provided scene description and the robot's configurations, aiming to incrementally acquire new skills upon task completion. For each proposed task, a series of reinforcement learning processes is initiated, using reward and success determination functions sampled by the LLM to develop the corresponding policy. The reliability and trustworthiness of learned behaviors are further ensured by an independent vision-LLM. We show that, starting with zero skills, the skill library emerges and expands to include increasingly meaningful and reliable skills, enabling the robot to efficiently propose and complete more advanced tasks. Project page: https://agentic-skill-discovery.github.io
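The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: every function name, the task list, and the `dist` state variable are hypothetical, and the LLM proposer, the RL training run, and the vision-LLM verifier are replaced by deterministic stubs so the control flow can be run on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]  # hypothetical summary of the final environment state


@dataclass
class Candidate:
    """One LLM-sampled pair of reward and success functions (hypothetical)."""
    reward_fn: Callable[[State], float]
    success_fn: Callable[[State], bool]


def propose_tasks(scene: str, skills: List[str]) -> List[str]:
    """Stub for the LLM proposer: returns tasks not yet in the library."""
    curriculum = ["reach cube", "push cube", "grasp cube"]  # made-up tasks
    return [t for t in curriculum if t not in skills]


def sample_candidates(task: str, n: int) -> List[Candidate]:
    """Stub for LLM sampling of n reward/success function pairs."""
    return [
        Candidate(
            reward_fn=lambda s, k=k: -abs(s["dist"]) * (k + 1),
            success_fn=lambda s: s["dist"] < 0.05,
        )
        for k in range(n)
    ]


def train_policy(task: str, cand: Candidate) -> State:
    """Stub for one RL run. A real run would optimize cand.reward_fn;
    here "grasp cube" is simply hard-coded to fail."""
    return {"dist": 0.2 if task == "grasp cube" else 0.01}


def vision_llm_approves(task: str, final_state: State) -> bool:
    """Stub for the independent vision-LLM check of the learned behavior."""
    return True


def discover_skills(scene: str, rounds: int, n_candidates: int = 3) -> List[str]:
    skills: List[str] = []  # the library starts with zero skills
    for _ in range(rounds):
        for task in propose_tasks(scene, skills):
            for cand in sample_candidates(task, n_candidates):
                final_state = train_policy(task, cand)
                # Store a skill only if both the sampled success function
                # and the independent verifier accept the behavior.
                if cand.success_fn(final_state) and vision_llm_approves(task, final_state):
                    skills.append(task)
                    break
    return skills


library = discover_skills("a cube on a table", rounds=2)
```

In this sketch a task that never passes both checks is simply proposed again in the next round. How the actual framework handles failed tasks is not stated in the abstract.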
- Xufeng Zhao
- Cornelius Weber
- Stefan Wermter