Two-Stage Learning of Highly Dynamic Motions with Rigid and Articulated Soft Quadrupeds (2309.09682v2)

Published 18 Sep 2023 in cs.RO

Abstract: Controlled execution of dynamic motions in quadrupedal robots, especially those with articulated soft bodies, presents a unique set of challenges that traditional methods struggle to address efficiently. In this study, we tackle these issues by relying on a simple yet effective two-stage learning framework to generate dynamic motions for quadrupedal robots. First, a gradient-free evolution strategy is employed to discover simply represented control policies, eliminating the need for a predefined reference motion. Then, we refine these policies using deep reinforcement learning. Our approach enables the acquisition of complex motions like pronking and back-flipping, effectively from scratch. Additionally, our method simplifies the traditionally labour-intensive task of reward shaping, boosting the efficiency of the learning process. Importantly, our framework proves particularly effective for articulated soft quadrupeds, whose inherent compliance and adaptability make them ideal for dynamic tasks but also introduce unique control challenges.
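
The abstract outlines a two-stage pipeline: a gradient-free evolution strategy first discovers a simply represented control policy without any reference motion, and deep reinforcement learning then refines it. Below is a minimal sketch of that idea, assuming a gym-like reset/step environment. The ToyHopperEnv stand-in, the linear policy parameterization, the mirrored-sampling ES, the reward, and all hyperparameters are illustrative assumptions rather than the paper's actual setup, and the deep-RL refinement stage (e.g. PPO) is only indicated in a comment.

```python
# Sketch of a two-stage "ES then deep RL" pipeline (illustrative only).
import numpy as np


class ToyHopperEnv:
    """Stand-in for a quadruped simulator with a gym-like reset/step API (assumption)."""

    def __init__(self, obs_dim=8, act_dim=4, horizon=200, seed=0):
        self.obs_dim, self.act_dim, self.horizon = obs_dim, act_dim, horizon
        self.rng = np.random.default_rng(seed)

    def reset(self):
        self.t = 0
        self.state = self.rng.normal(size=self.obs_dim)
        return self.state

    def step(self, action):
        # Toy dynamics: reward a proxy for a dynamic motion (e.g. jump impulse)
        # minus a small effort penalty. Not the paper's reward.
        self.t += 1
        self.state = 0.95 * self.state + 0.05 * self.rng.normal(size=self.obs_dim)
        reward = float(action[0]) - 0.01 * float(np.sum(action ** 2))
        done = self.t >= self.horizon
        return self.state, reward, done


def rollout(env, theta):
    """Episodic return of a simply represented policy: linear feedback a = W s."""
    W = theta.reshape(env.act_dim, env.obs_dim)
    obs, total, done = env.reset(), 0.0, False
    while not done:
        action = np.clip(W @ obs, -1.0, 1.0)
        obs, reward, done = env.step(action)
        total += reward
    return total


def evolution_strategy(env, iterations=100, pop=32, sigma=0.1, lr=0.05, seed=0):
    """Stage 1: gradient-free search with mirrored sampling, no reference motion."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(env.act_dim * env.obs_dim)
    for _ in range(iterations):
        eps = rng.normal(size=(pop, theta.size))
        ret_pos = np.array([rollout(env, theta + sigma * e) for e in eps])
        ret_neg = np.array([rollout(env, theta - sigma * e) for e in eps])
        grad = ((ret_pos - ret_neg)[:, None] * eps).mean(axis=0) / (2 * sigma)
        theta += lr * grad
    return theta


if __name__ == "__main__":
    env = ToyHopperEnv()
    theta_es = evolution_strategy(env)
    print("Stage-1 ES return:", rollout(env, theta_es))
    # Stage 2 (not implemented here): refine with deep RL, e.g. PPO, using the
    # ES policy to bootstrap exploration -- for instance by tracking its
    # trajectory in the reward or by warm-starting the actor network.
```

The split mirrors the division of labour described in the abstract: the cheap, low-dimensional ES stage replaces hand-crafted reference motions and heavy reward shaping, while the deep-RL stage supplies the expressiveness needed to polish highly dynamic behaviours such as pronking or back-flips.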
