
CTS: Concurrent Teacher-Student Reinforcement Learning for Legged Locomotion (2405.10830v2)

Published 17 May 2024 in cs.RO

Abstract: Thanks to the recent rapid development of data-driven learning methodologies, reinforcement learning (RL) has emerged as a promising solution to the legged locomotion problem in robotics. In this paper, we propose CTS, a novel Concurrent Teacher-Student reinforcement learning architecture for legged locomotion over uneven terrains. Unlike the conventional teacher-student architecture, which first trains the teacher policy via RL and then transfers the knowledge to the student policy through supervised learning, our proposed architecture trains the teacher and student policy networks concurrently under the reinforcement learning paradigm. To this end, we develop a new training scheme based on a modified proximal policy optimization (PPO) method that exploits data samples collected from the interactions of both the teacher and the student policies with the environment. The effectiveness of the proposed architecture and the new training scheme is demonstrated through extensive quantitative simulation comparisons with state-of-the-art approaches and through indoor and outdoor experiments with quadrupedal and point-foot bipedal robot platforms, showcasing robust and agile locomotion capability. The simulation comparisons show that our approach reduces the average velocity tracking error by up to 20% compared to the two-stage teacher-student approach, demonstrating a clear advantage in blind locomotion tasks. Videos are available at https://clearlab-sustech.github.io/concurrentTS.
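
The abstract does not say how PPO is modified, so the following is only a minimal illustrative sketch, not the authors' method. It shows the standard PPO clipped surrogate loss evaluated on a batch pooled from teacher-driven and student-driven rollouts, which is one plausible reading of "exploits data samples collected from the interactions of both the teacher and the student policies". The function names, the dictionary layout of a batch, and the simple concatenation are all assumptions made for illustration.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate loss (to be minimised), averaged over samples."""
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the raw and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


def pooled_loss(teacher_batch, student_batch, clip_eps=0.2):
    """Hypothetical pooling step (an assumption): concatenate the samples gathered
    under the teacher and under the student, then apply one PPO loss to all of them."""
    keys = ("logp_new", "logp_old", "adv")
    merged = {k: list(teacher_batch[k]) + list(student_batch[k]) for k in keys}
    return clipped_surrogate(merged["logp_new"], merged["logp_old"],
                             merged["adv"], clip_eps)


if __name__ == "__main__":
    # Toy numbers only; they do not come from the paper.
    teacher = {"logp_new": [0.0], "logp_old": [0.0], "adv": [1.0]}
    student = {"logp_new": [math.log(2.0)], "logp_old": [0.0], "adv": [1.0]}
    print(pooled_loss(teacher, student))
```

In the toy call, the teacher sample has a probability ratio of 1 and passes through unchanged, while the student sample has a ratio of about 2 and is clipped to 1.2, so the large policy change contributes only a bounded amount to the update.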

