
On the Role of the Action Space in Robot Manipulation Learning and Sim-to-Real Transfer (2312.03673v2)

Published 6 Dec 2023 in cs.RO and cs.LG

Abstract: We study the choice of action space in robot manipulation learning and sim-to-real transfer. We define metrics that assess the performance, and examine the emerging properties in the different action spaces. We train over 250 reinforcement learning~(RL) agents in simulated reaching and pushing tasks, using 13 different control spaces. The choice of spaces spans combinations of common action space design characteristics. We evaluate the training performance in simulation and the transfer to a real-world environment. We identify good and bad characteristics of robotic action spaces and make recommendations for future designs. Our findings have important implications for the design of RL algorithms for robot manipulation tasks, and highlight the need for careful consideration of action spaces when training and transferring RL agents for real-world robotics.

Authors (4)
  1. Elie Aljalbout (21 papers)
  2. Felix Frank (4 papers)
  3. Maximilian Karl (17 papers)
  4. Patrick van der Smagt (63 papers)
Citations (13)

Summary

Introduction

In robot manipulation learning, the particular control commands used—referred to as the "action space"—can greatly impact how robots learn tasks and how well these tasks transfer from simulated environments to the real world. Traditionally, robot learning policies directly handle low-level commands like joint torques, but more recent trends involve policies that output higher-level commands—like desired joint velocities—which are then translated into low-level commands by engineered controllers. This paper investigates how these different types of action spaces affect robot learning and simulation-to-reality transfer.
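
To make the distinction concrete, below is a minimal sketch of how a higher-level joint-velocity action could be turned into low-level torque commands by an engineered tracking controller. The gain, torque limit, and variable names are illustrative assumptions, not the paper's controller.

```python
import numpy as np

def velocity_action_to_torque(desired_qd, measured_qd, kp=40.0, torque_limit=50.0):
    """Map a joint-velocity action to joint torques with a simple proportional
    tracking law. The gain and limit values are illustrative placeholders."""
    tau = kp * (np.asarray(desired_qd) - np.asarray(measured_qd))
    # Saturate to stay within (hypothetical) actuator limits.
    return np.clip(tau, -torque_limit, torque_limit)

# Torque action space: the policy output is applied directly, tau = policy(obs).
# Joint-velocity action space: the policy output first passes through the
# engineered controller, tau = velocity_action_to_torque(policy(obs), robot_qd).
```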

Methodology and Experiment Design

The authors conducted extensive experiments, training over 250 reinforcement learning (RL) agents on simulated reaching and pushing tasks using 13 different control spaces. The action spaces evaluated range from direct low-level control, such as joint torques (JT), to configurations in which an embedded controller converts the policy's higher-level commands into joint torques.

To compare these spaces, the authors define several metrics. Success-rate transfer measures how much of the task success achieved in simulation is retained when the policy is deployed in the real world. The usability of behaviors captures how often a policy violates robot constraints such as acceleration and jerk limits. Finally, a sim-to-real gap metric quantifies how much the learned behavior changes when it is transferred to the physical environment.
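
As a rough illustration of the usability idea, the sketch below counts how often a recorded joint trajectory exceeds acceleration and jerk limits; the sampling time, limit values, and trajectory format are assumptions, not the paper's exact metric definition.

```python
import numpy as np

def constraint_violation_rate(q, dt=0.01, acc_limit=15.0, jerk_limit=7500.0):
    """Fraction of timesteps where finite-difference joint acceleration or jerk
    exceeds the given limits. Limits are illustrative per-joint values in
    rad/s^2 and rad/s^3.

    q : array of shape (T, n_joints) of recorded joint positions.
    """
    qd = np.gradient(q, dt, axis=0)      # joint velocities
    qdd = np.gradient(qd, dt, axis=0)    # joint accelerations
    qddd = np.gradient(qdd, dt, axis=0)  # joint jerks
    violated = (np.abs(qdd) > acc_limit).any(axis=1) | (np.abs(qddd) > jerk_limit).any(axis=1)
    return float(violated.mean())
```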

Results and Findings

Analysis of the training performance in the simulated environment showed that action spaces with higher-order control variables (like joint velocities) were more efficient. Delta action spaces, which adjust control targets based on current feedback, were also beneficial but required careful tuning of hyperparameters. In real-world deployment, joint velocity action spaces generally exhibited better transferability. Policies developed with these spaces approached tasks with higher accuracy and lower rates of constraint violation, translating to more reliable real-world performance.
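
As an illustration of the delta idea, the sketch below wraps a Gym-style environment so that the policy action is treated as an offset from the current joint positions rather than as an absolute target; the wrapper, its `current_joint_positions` attribute, and the scale factor are assumptions, not the paper's implementation.

```python
import numpy as np
import gymnasium as gym

class DeltaJointTargetWrapper(gym.ActionWrapper):
    """Interpret the policy action as an offset from the current joint positions
    rather than as an absolute target (a sketch of a delta action space)."""

    def __init__(self, env, delta_scale=0.05):
        super().__init__(env)
        self.delta_scale = delta_scale  # max per-step offset in radians (illustrative)

    def action(self, action):
        # New joint-position target = current positions + scaled, clipped action.
        # The `current_joint_positions` attribute is a hypothetical interface.
        q = np.asarray(self.env.unwrapped.current_joint_positions)
        return q + self.delta_scale * np.clip(action, -1.0, 1.0)
```

The per-step bound set by `delta_scale` is exactly the kind of hyperparameter the authors note requires careful tuning in delta action spaces.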

Conclusion

The paper's findings underscore the importance of action space selection in RL for robotic manipulation. The research identifies joint velocity-based action spaces as the most effective for learning and transferring manipulation tasks, primarily due to their ability to more easily accommodate dynamic interactions and enforce smoothness constraints on control trajectories. These insights can guide future designs of RL algorithms and suggest that integrating feedback mechanisms may reduce the sim-to-real gap and improve overall performance in real-world robotic applications.
