
Whole-body Humanoid Robot Locomotion with Human Reference (2402.18294v4)

Published 28 Feb 2024 in cs.RO

Abstract: Recently, humanoid robots have made significant advances in their ability to perform challenging tasks due to the deployment of Reinforcement Learning (RL); however, the inherent complexity of humanoid robots, including the difficulty of designing complicated reward functions and training entire sophisticated systems, still poses a notable challenge. To conquer these challenges, after many iterations and in-depth investigations, we have meticulously developed a full-size humanoid robot, "Adam", whose innovative structural design greatly improves the efficiency and effectiveness of the imitation learning process. In addition, we have developed a novel imitation learning framework based on an adversarial motion prior, which applies not only to Adam but also to humanoid robots in general. Using the framework, Adam can exhibit unprecedented human-like characteristics in locomotion tasks. Our experimental results demonstrate that the proposed framework enables Adam to achieve human-comparable performance in complex locomotion tasks, marking the first time that human locomotion data has been used for imitation learning in a full-size humanoid robot.


Summary

  • The paper introduces a novel imitation learning framework that simplifies reward design by leveraging human locomotion data.
  • The paper empirically validates robot Adam’s robust human-like gait and adaptability through both simulation and real-world experiments.
  • The paper leverages adversarial motion priors to bridge the Sim2Real gap, enhancing locomotion performance on complex terrains.

Overview of "Whole-body Humanoid Robot Locomotion with Human Reference"

The paper "Whole-body Humanoid Robot Locomotion with Human Reference" presents a comprehensive paper on enhancing the locomotion capabilities of humanoid robots through an innovative imitation learning framework. The researchers focus on the development of a full-size humanoid robot named "Adam," built to mimic human motion with a high degree of accuracy. This paper introduces a methodologically novel approach leveraging human locomotion data for reinforcement learning to achieve human-comparable locomotion in humanoid robots.

While the deployment of Reinforcement Learning (RL) has significantly advanced the field of humanoid robotics, traditional methods often struggle with the complexity of designing reward functions and training entire sophisticated systems. This research addresses these challenges with an imitation learning framework based on adversarial motion priors. The paper highlights Adam's adaptability and human-like motion execution, demonstrating successful locomotion on both familiar and novel terrains.

Key Contributions

  1. Biomimetic Design: The paper describes the design and development of the humanoid robot Adam, whose highly biomimetic structure gives its limbs a human-like range of motion. The design also brings significant advantages in cost-effectiveness and ease of maintenance, two persistent challenges for traditional humanoid robots.
  2. Imitation Learning Framework: A core contribution of this research is a novel whole-body imitation learning framework which simplifies the design of complex RL reward functions. By integrating human locomotion data, the framework reduces the Sim2Real gap and significantly enhances the learning capability and adaptability of humanoid robots.
  3. Empirical Validation: The experimental validation shows Adam achieving robust human-like performance across various complex motion tasks. The paper claims to be the first to leverage human locomotion data in this capacity, marking a significant practical demonstration and providing a new perspective for the future development of humanoid robots.
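The adversarial-motion-prior idea at the core of the framework can be illustrated with a short sketch. The reward mapping below follows the bounded least-squares form used in AMP-style methods (Peng et al., 2021); the task/style weighting and function names are illustrative assumptions, not the paper's exact formulation.

```python
def amp_style_reward(d: float) -> float:
    """Map a discriminator score d (trained toward +1 on human reference
    transitions and -1 on policy transitions) to a bounded style reward,
    following the AMP least-squares formulation."""
    return max(0.0, 1.0 - 0.25 * (d - 1.0) ** 2)

def total_reward(task_reward: float, disc_score: float,
                 w_task: float = 0.5, w_style: float = 0.5) -> float:
    # Weighted sum of a task term (e.g. velocity tracking) and the
    # motion-style term; the weights here are placeholders.
    return w_task * task_reward + w_style * amp_style_reward(disc_score)
```

A transition that the discriminator judges to be perfectly human-like (score 1) earns the full style reward, while clearly non-human transitions earn none, which is what removes the need for hand-crafted style terms in the RL reward.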

Results and Implications

The experimental results indicate strong performance in locomotion efficiency and adaptability. Adam is reported to achieve previously undemonstrated characteristics in its locomotion, such as a "heel-to-toe" walking pattern and human-like gait adaptability, both critical for real-world deployment of humanoid robots. The authors benchmark their methodology by cross-validating in simulation (Isaac Gym and Webots) and in real-world experiments, demonstrating the robustness of the approach.

The adversarial strategy in imitation learning helps bridge the gap between human locomotion data and robot control, yielding motion styles that closely resemble human movement. Consequently, this research broadens the horizon for humanoid robotics, offering pathways to program complex autonomy and flexibility into machines intended to operate alongside people in human environments.
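To make the adversarial step concrete, here is a toy least-squares discriminator update on transition features, in the spirit of AMP-style training: real samples stand in for retargeted human motion transitions and fake samples for policy rollouts. The linear discriminator, feature dimensions, and learning rate are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear discriminator over (state, next_state) transition features.
dim = 8
w = np.zeros(dim)

def disc(x):
    """Discriminator score for a batch of transition features."""
    return x @ w

def lsgan_step(real, fake, lr=0.01):
    """One least-squares update: push scores on human reference
    transitions toward +1 and scores on policy transitions toward -1."""
    global w
    grad = (2 * (disc(real) - 1.0)[:, None] * real).mean(axis=0) \
         + (2 * (disc(fake) + 1.0)[:, None] * fake).mean(axis=0)
    w -= lr * grad

# Synthetic stand-ins: "human" transitions cluster apart from "policy" ones.
real = rng.normal(1.0, 0.2, size=(64, dim))
fake = rng.normal(-1.0, 0.2, size=(64, dim))
for _ in range(200):
    lsgan_step(real, fake)

# After training, human-like transitions score higher than policy ones,
# so the induced style reward steers the policy toward human motion.
```

In a full system this discriminator would be retrained alongside the policy, with its score feeding the style component of the RL reward at every step.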

Future Directions

Further exploration is suggested in integrating additional sensory inputs and perceptual modules to enable Adam to adapt its motion in response to complex and dynamic environments. This adjustment could further solidify the practicality of humanoid robots in various application domains, including rescue operations, domestic assistance, and industrial tasks. Given the versatility of this learning model, potential pathways include expanding the dataset of human motion to cover a wider range of activities, thus refining the robot's ability to mimic complex real-world tasks.

In essence, the paper offers a well-grounded approach to addressing some of the existing limitations in humanoid robotics, and its methodology and experimental validation may catalyze further advances in the field. The work contributes meaningfully to the ongoing effort to bring physical robots with intricate, human-like capabilities into everyday human activities.
