
Words into Action: Learning Diverse Humanoid Robot Behaviors using Language Guided Iterative Motion Refinement (2310.06226v1)

Published 10 Oct 2023 in cs.RO

Abstract: Humanoid robots are well suited for human habitats due to their morphological similarity, but developing controllers for them is a challenging task that involves multiple sub-problems, such as control, planning, and perception. In this paper, we introduce a method to simplify controller design by enabling users to train and fine-tune robot control policies using natural language commands. We first learn a neural network policy that generates behaviors given a natural language command, such as "walk forward", by combining LLMs, motion retargeting, and motion imitation. Based on the synthesized motion, we iteratively fine-tune the policy by updating the text prompt and querying LLMs to find the best checkpoint associated with the closest motion in history. We validate our approach using a simulated Digit humanoid robot and demonstrate learning of diverse motions, such as walking, hopping, and kicking, without the burden of complex reward engineering. In addition, we show that our iterative refinement enables us to learn 3x faster than a naive formulation that learns from scratch.
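The abstract's key refinement step is to warm-start each fine-tuning round from the stored checkpoint whose motion is closest to the newly synthesized reference motion. The toy sketch below illustrates only that selection step; `Checkpoint`, `motion_distance`, the feature vectors, and the labels are illustrative stand-ins, not the paper's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    """A stored policy checkpoint tagged with its motion's feature vector."""
    label: str
    motion: list


def motion_distance(a, b):
    """Euclidean distance between two toy motion feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def closest_checkpoint(history, motion):
    """Pick the historical checkpoint nearest the new reference motion,
    to serve as the warm start for the next fine-tuning round."""
    return min(history, key=lambda c: motion_distance(c.motion, motion))


# Checkpoint history from earlier training rounds (toy 2-D features).
history = [
    Checkpoint("walk", [1.0, 0.0]),
    Checkpoint("hop", [0.0, 2.0]),
]

# A motion synthesized from a refined prompt lands near the "walk" checkpoint,
# so fine-tuning resumes from there instead of training from scratch.
new_motion = [0.9, 0.2]
warm_start = closest_checkpoint(history, new_motion)
print(warm_start.label)  # -> walk
```

In the paper's setting, the distance would be computed over retargeted motion trajectories and the warm start feeds a motion-imitation RL step; the sketch only conveys the nearest-checkpoint selection that makes refinement cheaper than learning from scratch.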

Authors (3)
  1. K. Niranjan Kumar (5 papers)
  2. Irfan Essa (91 papers)
  3. Sehoon Ha (60 papers)
Citations (6)
