Learning Quadrupedal Locomotion via Differentiable Simulation (2404.02887v1)
Abstract: The emergence of differentiable simulators enabling analytic gradient computation has motivated a new wave of learning algorithms that hold the potential to significantly increase sample efficiency over traditional Reinforcement Learning (RL) methods. While recent research has demonstrated performance gains in scenarios with comparatively smooth dynamics and, thus, smooth optimization landscapes, research on leveraging differentiable simulators for contact-rich scenarios, such as legged locomotion, is scarce. This may be attributed to the discontinuous nature of contact, which introduces several challenges to optimizing with analytic gradients. The purpose of this paper is to determine if analytic gradients can be beneficial even in the face of contact. Our investigation focuses on the effects of different soft and hard contact models on the learning process, examining optimization challenges through the lens of contact simulation. We demonstrate the viability of employing analytic gradients to learn physically plausible locomotion skills with a quadrupedal robot using Short-Horizon Actor-Critic (SHAC), a learning algorithm leveraging analytic gradients, and draw a comparison to a state-of-the-art RL algorithm, Proximal Policy Optimization (PPO), to understand the benefits of analytic gradients.
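As a rough illustration of what "analytic gradients through contact" means, the sketch below differentiates a toy one-dimensional simulation with a soft (penalty-spring) contact model. It is not the paper's simulator and not SHAC. The function name `simulate`, the unit mass, the stiffness value, and the time step are all assumptions chosen for illustration. The derivative of the final height with respect to the initial velocity is propagated alongside the state (forward-mode differentiation written out by hand), so it can be compared against a finite-difference estimate.

```python
def simulate(v0, stiffness=2000.0, dt=1e-3, steps=600):
    """Drop a 1 kg point mass from 0.5 m with initial vertical velocity v0.

    Returns (final height, derivative of final height with respect to v0).
    Hypothetical toy model: the ground is a one-sided linear spring.
    """
    g = 9.81
    z, v = 0.5, v0      # state: height and vertical velocity
    dz, dv = 0.0, 1.0   # tangents: d z / d v0 and d v / d v0
    for _ in range(steps):
        # Soft contact: the spring pushes back only while the mass penetrates (z < 0).
        if z < 0.0:
            force = -stiffness * z
            dforce = -stiffness * dz
        else:
            force = 0.0
            dforce = 0.0
        # Semi-implicit Euler step (mass = 1), applied to state and tangents alike.
        v = v + dt * (force - g)
        dv = dv + dt * dforce
        z = z + dt * v
        dz = dz + dt * dv
    return z, dz


if __name__ == "__main__":
    v0 = -1.0
    height, analytic = simulate(v0)
    eps = 1e-6
    numeric = (simulate(v0 + eps)[0] - simulate(v0 - eps)[0]) / (2 * eps)
    print("final height:", height)
    print("analytic gradient:", analytic)
    print("finite-difference gradient:", numeric)
```

In this toy model the contact force is continuous in the penetration depth, so the gradient is well defined almost everywhere. A hard contact model would instead switch velocities instantaneously at impact, which is one source of the optimization difficulties the abstract refers to.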
Authors:
- Clemens Schwarke
- Victor Klemm
- Jesus Tordesillas
- Jean-Pierre Sleiman
- Marco Hutter