HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation (2403.10506v2)

Published 15 Mar 2024 in cs.RO, cs.AI, and cs.LG

Abstract: Humanoid robots hold great promise in assisting humans in diverse environments and tasks, due to their flexibility and adaptability leveraging human-like morphology. However, research in humanoid robots is often bottlenecked by the costly and fragile hardware setups. To accelerate algorithmic research in humanoid robots, we present a high-dimensional, simulated robot learning benchmark, HumanoidBench, featuring a humanoid robot equipped with dexterous hands and a variety of challenging whole-body manipulation and locomotion tasks. Our findings reveal that state-of-the-art reinforcement learning algorithms struggle with most tasks, whereas a hierarchical learning approach achieves superior performance when supported by robust low-level policies, such as walking or reaching. With HumanoidBench, we provide the robotics community with a platform to identify the challenges arising when solving diverse tasks with humanoid robots, facilitating prompt verification of algorithms and ideas. The open-source code is available at https://humanoid-bench.github.io.


Summary

  • The paper introduces HumanoidBench, a simulated benchmark of whole-body locomotion and manipulation tasks for humanoid robots.
  • It builds on the MuJoCo physics engine and simulates human-like kinematics with rich sensory feedback, including egocentric vision and whole-body tactile sensing.
  • Initial RL evaluations show that current methods struggle with the high-dimensional action space, while hierarchical approaches built on robust low-level skills fare better.

Introducing HumanoidBench: A Comprehensive Benchmark for Humanoid Robots in Locomotion and Manipulation Tasks

Overview

Research and development in humanoid robotics has long aspired to deploy humanoid robots in varied human environments. These robots, equipped with human-like forms and capabilities, could change how tasks are performed, particularly in domains where human presence is risky or impractical. However, developing effective locomotion and manipulation strategies for humanoid robots remains an enduring challenge: intricate whole-body control, coordination across many degrees of freedom, and the execution of long-horizon tasks are all significant hurdles.

In response to these challenges, a new benchmark titled HumanoidBench has been introduced. This benchmark is designed for whole-body locomotion and manipulation tasks, specifically tailored for humanoid robots. HumanoidBench is built upon simulated environments, leveraging the MuJoCo physics engine for realistic simulations.

Simulation Environment and Task Suite

The HumanoidBench environment features a simulated humanoid robot with two dexterous hands, modeled after real-world platforms such as the Unitree H1 and Agility Robotics Digit, paired with dexterous hand models such as the Shadow Hand. This setup supports the study of dexterous manipulation alongside basic locomotion. The benchmark also provides egocentric visual observations and whole-body tactile sensing, supplying rich data for learning algorithms.
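
As a concrete illustration, here is a minimal sketch of instantiating a benchmark task through the Gymnasium API. The package name `humanoid_bench` and the task ID `h1hand-walk-v0` follow the project's public repository, but treat them as assumptions and check the docs:

```python
import gymnasium as gym
import humanoid_bench  # assumed: importing registers the benchmark's task IDs

# Task ID follows the repo's apparent `<robot>-<task>-v0` convention.
env = gym.make("h1hand-walk-v0")
obs, info = env.reset(seed=0)
print(env.observation_space)  # proprioceptive state (plus optional vision/touch)
print(env.action_space)       # high-dimensional joint position targets

# Roll out a random policy for a few control steps.
for _ in range(10):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```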

HumanoidBench offers a comprehensive suite of 27 tasks, split into locomotion (12 tasks) and whole-body manipulation (15 tasks). The locomotion tasks cover skills such as walking, running, and navigating mazes, while the manipulation tasks involve more complex actions such as organizing items on a shelf, playing basketball, and opening different types of cabinet doors. The tasks present a gradient of difficulty, from basic skills to highly involved behaviors that demand precise coordination across the humanoid's body.
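
To give a flavor of working with the suite, the sketch below evaluates a random policy on a handful of tasks. The specific task IDs are assumptions extrapolated from the naming pattern above, not an authoritative listing:

```python
import gymnasium as gym
import humanoid_bench  # assumed task registration

# Illustrative task IDs spanning both categories; the full suite has
# 12 locomotion and 15 whole-body manipulation tasks.
TASKS = ["h1hand-walk-v0", "h1hand-maze-v0", "h1hand-basketball-v0"]

for task_id in TASKS:
    env = gym.make(task_id)
    obs, info = env.reset(seed=0)
    episode_return, done = 0.0, False
    while not done:
        obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
        episode_return += reward
        done = terminated or truncated
    print(f"{task_id}: random-policy return = {episode_return:.1f}")
    env.close()
```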

Benchmarking Reinforcement Learning Algorithms

Initial evaluations with state-of-the-art reinforcement learning (RL) algorithms, including DreamerV3, TD-MPC2, SAC, and PPO, highlight the difficulty of the HumanoidBench tasks. The algorithms, particularly PPO and SAC, achieved limited success across the task suite, underscoring current limitations in handling high-dimensional action and observation spaces, as well as the need for sample efficiency and planning over long horizons.
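
For reference, a baseline such as SAC can be run against the benchmark largely off the shelf. The sketch below uses Stable-Baselines3 with an assumed manipulation task ID and default hyperparameters; it is a starting point, not the paper's exact training setup:

```python
import gymnasium as gym
import humanoid_bench  # assumed task registration
from stable_baselines3 import SAC

env = gym.make("h1hand-push-v0")  # illustrative manipulation task ID

# SB3 default hyperparameters, not the paper's tuned settings.
model = SAC("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=1_000_000)
model.save("sac_h1hand_push")
```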

The benchmark indicates a distinct gap between the current capabilities of RL algorithms and the complexity of tasks that humanoid robots are expected to perform. Notably, hierarchical reinforcement learning approaches showcased some promise, outperforming the flat, end-to-end methods on specific tasks, suggesting that incorporating structure and leveraging prelearned skills could be a direction for future research.
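
The hierarchical idea can be sketched as a high-level policy that emits targets for frozen, pretrained low-level skills. Everything below is illustrative structure under that assumption, not the paper's implementation; the class, policies, and dimensionalities are hypothetical:

```python
import numpy as np

class HierarchicalController:
    """Illustrative two-level controller: a high-level policy picks a
    target (e.g., a hand-reaching goal), and a frozen low-level skill
    (e.g., a pretrained reaching or walking policy) maps the target
    plus proprioception to joint commands."""

    def __init__(self, high_level_policy, low_level_skill, high_level_period=25):
        self.high = high_level_policy    # trained on the downstream task
        self.low = low_level_skill       # pretrained and frozen
        self.period = high_level_period  # low-level steps per high-level decision
        self._target = None
        self._steps = 0

    def act(self, obs):
        # Re-plan at a coarse timescale; act at the control frequency.
        if self._target is None or self._steps % self.period == 0:
            self._target = self.high(obs)
        self._steps += 1
        return self.low(obs, self._target)

# Usage with dummy stand-in policies:
high = lambda obs: np.zeros(3)          # e.g., a 3-D reaching target
low = lambda obs, target: np.zeros(19)  # e.g., joint commands (illustrative size)
controller = HierarchicalController(high, low)
```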

Implications and Future Directions

HumanoidBench sets the stage for rigorous and systematic evaluation of locomotion and manipulation strategies in humanoid robots. The performance of various algorithms on this benchmark not only underscores the current challenges but also opens new avenues for research in robot learning. For instance, the need for algorithms that can efficiently explore high-dimensional spaces, handle complex dynamics, and plan over extended time horizons is evident.

Moreover, HumanoidBench, with its focus on simulated environments, facilitates the testing and iteration of algorithms without the overhead of physical prototypes. This can accelerate the development cycle and enable more researchers to contribute to advancing humanoid robotics.

In conclusion, HumanoidBench serves as a foundational step towards realizing the full potential of humanoid robots. It presents a diverse array of tasks that mirror real-world applications, offering a comprehensive platform for benchmarking and advancing humanoid robot learning and control strategies.