Symmetry Considerations for Learning Task Symmetric Robot Policies (2403.04359v1)
Abstract: Symmetry is a fundamental aspect of many real-world robotic tasks. However, current deep reinforcement learning (DRL) approaches rarely exploit symmetry effectively. Often, the learned behaviors fail to achieve the desired transformation invariances and suffer from motion artifacts. For instance, a quadruped may exhibit different gaits when commanded to move forward or backward, even though it is symmetric about its torso. This issue becomes even more pronounced in high-dimensional or complex environments, where DRL methods are prone to local optima and fail to explore regions of the state space equally. Past work on encouraging symmetry in robotic tasks has studied this topic mainly in a single-task setting, where symmetry usually refers to symmetry in the motion, such as gait patterns. In this paper, we revisit this topic for goal-conditioned tasks in robotics, where symmetry lies mainly in task execution and not necessarily in the learned motions themselves. In particular, we investigate two approaches to incorporating symmetry invariance into DRL: data augmentation and a mirror loss function. We provide a theoretical foundation for using augmented samples in an on-policy setting. Based on this, we show that the corresponding approach achieves faster convergence and improves the learned behaviors in various challenging robotic tasks, from climbing boxes with a quadruped to dexterous manipulation.
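To make the two mechanisms named in the abstract concrete, below is a minimal PyTorch sketch of (1) symmetry-based data augmentation of on-policy samples and (2) an auxiliary mirror loss. All names here (GaussianPolicy, mirror_state, mirror_action, augment_batch, mirror_loss, w_sym) and the fixed signed-permutation mirror maps are illustrative assumptions, not the paper's implementation; on a real robot the mirror maps are hand-specified signed permutations of the state and action coordinates.

```python
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    """Toy diagonal-Gaussian policy, included only to make the sketch runnable."""
    def __init__(self, state_dim: int = 4, action_dim: int = 2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(),
                                 nn.Linear(64, action_dim))
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def mean_action(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s)

    def log_prob(self, s: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
        dist = torch.distributions.Normal(self.net(s), self.log_std.exp())
        return dist.log_prob(a).sum(-1)

# Placeholder mirror maps: fixed signed permutations of the coordinates.
# For a real robot these would encode the left/right reflection of the
# joint and base coordinates; the dimensions here are arbitrary.
S_MIRROR = torch.diag(torch.tensor([1.0, -1.0, 1.0, -1.0]))  # 4-D state
A_MIRROR = torch.diag(torch.tensor([-1.0, 1.0]))             # 2-D action

def mirror_state(s: torch.Tensor) -> torch.Tensor:
    return s @ S_MIRROR

def mirror_action(a: torch.Tensor) -> torch.Tensor:
    return a @ A_MIRROR

# (1) Data augmentation: duplicate each on-policy sample with its mirrored
# counterpart before the policy update. Advantages are reused, since returns
# are invariant under the task symmetry; the behavior log-probability of the
# mirrored pair is re-evaluated under the policy that collected the data
# (here: the current policy, before any gradient step of the update).
def augment_batch(policy, states, actions, logps, advs):
    ms, ma = mirror_state(states), mirror_action(actions)
    with torch.no_grad():
        mlogp = policy.log_prob(ms, ma)
    return (torch.cat([states, ms]), torch.cat([actions, ma]),
            torch.cat([logps, mlogp]), torch.cat([advs, advs]))

# (2) Mirror loss: an auxiliary term that penalizes the policy for acting
# differently on mirrored states, added to the RL objective with weight w_sym.
def mirror_loss(policy, states, w_sym: float = 1.0):
    mu = policy.mean_action(states)
    mu_mirrored = policy.mean_action(mirror_state(states))
    return w_sym * ((mirror_action(mu) - mu_mirrored) ** 2).mean()

# Tiny smoke test with random data.
policy = GaussianPolicy()
s, a = torch.randn(8, 4), torch.randn(8, 2)
logp = policy.log_prob(s, a).detach()
adv = torch.randn(8)
s2, a2, logp2, adv2 = augment_batch(policy, s, a, logp, adv)
print(s2.shape, mirror_loss(policy, s).item())
```

In an on-policy method such as PPO, augment_batch would be applied once per rollout batch before the clipped-surrogate update, whereas mirror_loss would be added to the actor loss at every gradient step; the paper's theoretical contribution concerns when the first of these is sound for on-policy learning.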