
Learning Quadruped Locomotion Using Differentiable Simulation (2403.14864v4)

Published 21 Mar 2024 in cs.RO and cs.AI

Abstract: This work explores the potential of using differentiable simulation for learning quadruped locomotion. Differentiable simulation promises fast convergence and stable training by computing low-variance first-order gradients using robot dynamics. However, its usage for legged robots is still limited to simulation. The main challenge lies in the complex optimization landscape of robotic tasks due to discontinuous dynamics. This work proposes a new differentiable simulation framework to overcome these challenges. Our approach combines a high-fidelity, non-differentiable simulator for forward dynamics with a simplified surrogate model for gradient backpropagation. This approach maintains simulation accuracy by aligning the robot states from the surrogate model with those of the precise, non-differentiable simulator. Our framework enables learning quadruped walking in simulation in minutes without parallelization. When augmented with GPU parallelization, our approach allows the quadruped robot to master diverse locomotion skills on challenging terrains in minutes. We demonstrate that differentiable simulation outperforms a reinforcement learning algorithm (PPO) by achieving significantly better sample efficiency while maintaining its effectiveness in handling large-scale environments. Our method represents one of the first successful applications of differentiable simulation to real-world quadruped locomotion, offering a compelling alternative to traditional RL methods.
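
The abstract's central idea, running forward dynamics through an accurate but non-differentiable simulator while backpropagating gradients through a simplified surrogate model aligned to the simulator's states, can be illustrated with a short sketch. The snippet below is a minimal, hypothetical PyTorch rendering of that pattern (a straight-through-style step), not the authors' released code; precise_sim and surrogate_model are assumed callables standing in for the two dynamics models.

```python
import torch

def hybrid_step(state, action, precise_sim, surrogate_model):
    """One simulation step: accurate forward values, surrogate gradients.

    precise_sim and surrogate_model are hypothetical callables mapping
    (state, action) -> next_state; the real framework may differ.
    """
    # Differentiable path: autograd records the surrogate's dynamics.
    surrogate_next = surrogate_model(state, action)

    # Accurate path: run the high-fidelity simulator without tracking gradients.
    with torch.no_grad():
        precise_next = precise_sim(state, action)

    # Align states: the returned value equals the precise simulator's state,
    # while gradients flow back through the surrogate model.
    return surrogate_next + (precise_next - surrogate_next).detach()
```

Unrolling a trajectory with such steps would keep the rollout on the high-fidelity simulator's states while providing low-variance first-order gradients from the surrogate for policy optimization.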

Authors (3)
  1. Yunlong Song (26 papers)
  2. Sangbae Kim (31 papers)
  3. Davide Scaramuzza (190 papers)
Citations (8)

