RESPRECT: Speeding-up Multi-fingered Grasping with Residual Reinforcement Learning (2401.14858v1)

Published 26 Jan 2024 in cs.RO

Abstract: Deep Reinforcement Learning (DRL) has proven effective for learning control policies with robotic grippers, but it has been much less practical for grasping with dexterous hands, especially on real robotic platforms, because of the high dimensionality of the problem. In this work, we focus on the multi-fingered grasping task with the anthropomorphic hand of the iCub humanoid. We propose the RESidual learning with PREtrained CriTics (RESPRECT) method that, starting from a policy pre-trained on a large set of objects, learns a residual policy to grasp a novel object in a fraction of the timesteps required to train a policy from scratch ($\sim 5 \times$ faster), without requiring any task demonstration. To our knowledge, this is the first Residual Reinforcement Learning (RRL) approach that learns a residual policy on top of another policy pre-trained with DRL. During residual learning, we exploit some components of the pre-trained policy to further speed up training. We benchmark our results in the iCub simulated environment, and we show that RESPRECT can be used effectively to learn a multi-fingered grasping policy on the real iCub robot. The code to reproduce the experiments is released together with the paper under an open source license.
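
The residual idea in the abstract can be illustrated with a minimal sketch. It assumes the additive composition commonly used in residual reinforcement learning, where the executed action is the frozen pre-trained policy's action plus a learned correction, clipped to the action bounds. The abstract does not spell out the exact composition used by RESPRECT, so the function names (`make_residual_policy`, `frozen_base`, `residual`), the `residual_scale` parameter, the toy policies, and the `[-1, 1]` action range are all hypothetical and not taken from the released code.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Policy = Callable[[Sequence[float]], Vector]


def clip(values: Sequence[float], low: float = -1.0, high: float = 1.0) -> Vector:
    """Clamp each action component to the (assumed) action bounds."""
    return [max(low, min(high, v)) for v in values]


def make_residual_policy(
    base_policy: Policy,
    residual_policy: Policy,
    residual_scale: float = 1.0,  # hypothetical knob, not from the paper
) -> Policy:
    """Combine a frozen base policy with a learned residual by summation."""

    def combined(observation: Sequence[float]) -> Vector:
        base_action = base_policy(observation)          # pre-trained, not updated
        residual_action = residual_policy(observation)  # trained on the new object
        if len(base_action) != len(residual_action):
            raise ValueError("base and residual actions must have the same size")
        summed = [b + residual_scale * r for b, r in zip(base_action, residual_action)]
        return clip(summed)

    return combined


# Toy stand-ins for the two policies (purely illustrative).
def frozen_base(observation: Sequence[float]) -> Vector:
    return [0.5 * x for x in observation]


def residual(observation: Sequence[float]) -> Vector:
    return [0.1 for _ in observation]


policy = make_residual_policy(frozen_base, residual)
action = policy([0.4, -0.2, 2.0])
print(action)
```

In this sketch only `residual` would be updated during training on a novel object. The base policy stays fixed, so the agent starts from sensible grasping behaviour instead of random exploration.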
