
End-to-End Reinforcement Learning for Torque Based Variable Height Hopping (2307.16676v2)

Published 31 Jul 2023 in cs.RO, cs.LG, cs.SY, and eess.SY

Abstract: Legged locomotion is arguably the most suited and versatile mode to deal with natural or unstructured terrains. Intensive research into dynamic walking and running controllers has recently yielded great advances, both in the optimal control and reinforcement learning (RL) literature. Hopping is a challenging dynamic task involving a flight phase and has the potential to increase the traversability of legged robots. Model-based control for hopping typically relies on accurate detection of different jump phases, such as lift-off or touch-down, and uses a different controller for each phase. In this paper, we present an end-to-end RL-based torque controller that learns to implicitly detect the relevant jump phases, removing the need to provide manual heuristics for state detection. We also extend a method for simulation-to-reality transfer of the learned controller to contact-rich dynamic tasks, resulting in successful deployment on the robot after training without parameter tuning.
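The abstract describes the approach at a high level: a single policy maps proprioceptive observations directly to motor torques and is trained with RL, so jump phases are handled implicitly rather than through hand-coded lift-off/touch-down detection. The snippet below is a minimal sketch of that training pattern under simplifying assumptions: a 1-D point-mass hopper stands in for the real segmented leg, the action is a vertical leg force rather than joint torques, the reward is a simple height-tracking proxy, and Gymnasium plus Stable-Baselines3 SAC are used as the training stack. None of these choices reflect the paper's actual environment, reward design, or hyperparameters.

```python
# Minimal sketch: a simplified 1-D hopper trained end-to-end with SAC.
# All dynamics, rewards, and hyperparameters here are illustrative
# assumptions and not the paper's actual setup.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import SAC


class VerticalHopperEnv(gym.Env):
    """Point-mass hopper: the policy commands a leg force that only acts while
    the leg touches the ground, so lift-off/touch-down is never detected
    explicitly -- the policy has to infer it from height and velocity."""

    def __init__(self, dt=0.002, mass=2.0, leg_len=0.3):
        self.dt, self.m, self.leg_len, self.g = dt, mass, leg_len, 9.81
        # observation: [height, vertical velocity, commanded apex height]
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(3,), dtype=np.float32)
        # action: normalized leg force in [-1, 1], scaled to 0..100 N upward
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

    def _obs(self):
        return np.array([self.z, self.zd, self.target], dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.z, self.zd, self.t = self.leg_len, 0.0, 0
        self.target = self.np_random.uniform(0.4, 0.8)   # commanded hop height
        return self._obs(), {}

    def step(self, action):
        force = 50.0 * (float(np.clip(action[0], -1.0, 1.0)) + 1.0)  # 0..100 N
        in_contact = self.z <= self.leg_len           # contact is implicit in the state
        zdd = -self.g + (force / self.m if in_contact else 0.0)
        self.zd += zdd * self.dt
        self.z += self.zd * self.dt
        if self.z < 0.1:                              # leg fully compressed
            self.z, self.zd = 0.1, max(self.zd, 0.0)
        self.t += 1
        # proxy reward: get close to the commanded height, keep effort small
        reward = -abs(self.z - self.target) - 1e-5 * force ** 2
        return self._obs(), reward, False, self.t >= 2000, {}


if __name__ == "__main__":
    model = SAC("MlpPolicy", VerticalHopperEnv(), verbose=1)
    model.learn(total_timesteps=50_000)               # short run, illustration only
```

A real setup would replace the toy dynamics with the actual robot simulation and map the policy output to joint torques, but the overall structure, a torque-level policy trained against randomized height commands with no explicit phase machine, stays the same.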

