Contact Energy Based Hindsight Experience Prioritization (2312.02677v2)
Abstract: Multi-goal robot manipulation tasks with sparse rewards are difficult for reinforcement learning (RL) algorithms because successful experiences are collected inefficiently. Recent algorithms such as Hindsight Experience Replay (HER) expedite learning by taking advantage of failed trajectories, replacing the desired goal with one of the achieved states so that any failed trajectory can still contribute to learning. However, HER chooses failed trajectories uniformly, without considering which ones might be the most valuable for learning. In this paper, we address this problem and propose Contact Energy Based Prioritization (CEBP), a novel approach that selects samples from the replay buffer based on contact-rich information, leveraging the touch sensors in the robot's gripper and the displacement of the object. Our prioritization scheme favors sampling of contact-rich experiences, which are arguably the ones providing the largest amount of information. We evaluate the proposed approach on various sparse-reward robotic tasks and compare it with state-of-the-art methods. We show that our method surpasses or performs on par with those methods on robot manipulation tasks. Finally, we deploy the policy trained with our method on a real Franka robot for a pick-and-place task and observe that the robot solves the task successfully. The videos and code are publicly available at: https://erdiphd.github.io/HER_force
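To make the prioritization idea concrete, below is a minimal Python sketch of how contact-rich trajectories could be scored and then sampled from a replay buffer. The buffer layout, function names, and the specific energy formula (per-step contact force weighted by object displacement, turned into sampling probabilities with a softmax) are illustrative assumptions rather than the authors' exact implementation, which is available at the link above.

```python
import numpy as np

# Sketch of contact-energy-based prioritized sampling (hypothetical names and
# formula). Each stored trajectory is assumed to hold per-step gripper
# touch-sensor readings ("touch") and object positions ("obj_pos").

def contact_energy(touch_readings, object_positions):
    """Score a trajectory by combining contact forces with object displacement.

    touch_readings: (T, n_sensors) array of gripper touch-sensor values.
    object_positions: (T, 3) array of object positions over the episode.
    """
    force = np.abs(touch_readings).sum(axis=1)              # (T,) total contact per step
    displacement = np.linalg.norm(
        np.diff(object_positions, axis=0), axis=1)          # (T-1,) per-step object motion
    # Assumed combination: contact force weighted by how much the object moved.
    return float((force[1:] * displacement).sum())

def prioritized_indices(buffer, batch_size, temperature=1.0, rng=None):
    """Sample trajectory indices with probabilities from a softmax over energies."""
    rng = rng or np.random.default_rng()
    energies = np.array([contact_energy(tr["touch"], tr["obj_pos"]) for tr in buffer])
    logits = energies / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return rng.choice(len(buffer), size=batch_size, p=probs)

# The sampled trajectories would then be relabeled with achieved goals (HER)
# before being used to update an off-policy agent such as DDPG.
```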
- L. P. Kaelbling, M. L. Littman, and A. W. Moore, “Reinforcement learning: A survey,” Journal of Artificial Intelligence Research, 1996.
- V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, “Playing Atari with deep reinforcement learning,” arXiv preprint arXiv:1312.5602, 2013.
- V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, et al., “Human-level control through deep reinforcement learning,” Nature, 2015.
- D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, et al., “Mastering the game of Go with deep neural networks and tree search,” Nature, 2016.
- A. Y. Ng, H. J. Kim, M. I. Jordan, S. Sastry, and S. Ballianda, “Autonomous helicopter flight via reinforcement learning.” in NIPS, vol. 16. Citeseer, 2003.
- A. Y. Ng, A. Coates, M. Diel, V. Ganapathi, J. Schulte, B. Tse, E. Berger, and E. Liang, “Autonomous inverted helicopter flight via reinforcement learning.” Springer, 2006.
- T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, “Continuous control with deep reinforcement learning,” arXiv preprint arXiv:1509.02971, 2015.
- A. Y. Ng, D. Harada, and S. Russell, “Policy invariance under reward transformations: Theory and application to reward shaping,” in ICML, 1999.
- M. Andrychowicz, F. Wolski, A. Ray, J. Schneider, R. Fong, P. Welinder, B. McGrew, J. Tobin, P. Abbeel, and W. Zaremba, “Hindsight experience replay,” arXiv preprint arXiv:1707.01495, 2017.
- T. Schaul, J. Quan, I. Antonoglou, and D. Silver, “Prioritized experience replay,” arXiv preprint arXiv:1511.05952, 2015.
- R. Zhao, X. Sun, and V. Tresp, “Maximum entropy-regularized multi-goal reinforcement learning,” in International Conference on Machine Learning, 2019, pp. 7553–7562.
- M. Plappert, M. Andrychowicz, A. Ray, B. McGrew, B. Baker, G. Powell, J. Schneider, J. Tobin, M. Chociej, P. Welinder, et al., “Multi-goal reinforcement learning: Challenging robotics environments and request for research,” arXiv preprint arXiv:1802.09464, 2018.
- H. Merzic, M. Bogdanovic, D. Kappler, L. Righetti, and J. Bohg, “Leveraging contact forces for learning to grasp,” 2018.
- C. Wang, S. Wang, B. Romero, F. Veiga, and E. Adelson, “Swingbot: Learning physical features from in-hand tactile exploration for dynamic swing-up manipulation,” 2021.
- A. Melnik, L. Lach, M. Plappert, T. Korthals, R. Haschke, and H. Ritter, “Tactile sensing and deep reinforcement learning for in-hand manipulation tasks,” in IROS Workshop on Autonomous Object Manipulation, 2019.
- N. Vulin, S. Christen, S. Stevšić, and O. Hilliges, “Improved learning of robot manipulation tasks via tactile intrinsic motivation,” IEEE Robotics and Automation Letters, 2021.
- T. Li, H. Luo, L. Qin, X. Wang, Z. Xiong, H. Ding, Y. Gu, Z. Liu, and T. Zhang, “Flexible capacitive tactile sensor based on micropatterned dielectric layer,” Small, vol. 12, no. 36, pp. 5042–5048, 2016.
- K. Weiß and H. Wörn, “The working principle of resistive tactile sensor cells,” in IEEE International Conference on Mechatronics and Automation, vol. 1. IEEE, 2005, pp. 471–476.
- E. Donlon, S. Dong, M. Liu, J. Li, E. Adelson, and A. Rodriguez, “Gelslim: A high-resolution, compact, robust, and calibrated tactile-sensing finger,” in 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2018, pp. 1927–1934.
- M. Kaboli, K. Yao, D. Feng, and G. Cheng, “Tactile-based active object discrimination and target object search in an unknown workspace,” Autonomous Robots, vol. 43, no. 1, pp. 123–152, 2019.
- J. W. James, N. Pestell, and N. F. Lepora, “Slip detection with a biomimetic tactile sensor,” IEEE Robotics and Automation Letters, 2018.
- J. Bimbo, S. Luo, K. Althoefer, and H. Liu, “In-hand object pose estimation using covariance-based tactile to geometry matching,” IEEE Robotics and Automation Letters, vol. 1, no. 1, pp. 570–577, 2016.
- J. M. Romano, K. Hsiao, G. Niemeyer, S. Chitta, and K. J. Kuchenbecker, “Human-inspired robotic grasp control with tactile sensing,” IEEE Transactions on Robotics, vol. 27, no. 6, pp. 1067–1079, 2011.
- M. Kaboli, “New methods for active tactile object perception and learning with artificial robotic skin,” Dissertation, Technische Universität München, München, 2017.
- H. Van Hoof, T. Hermans, G. Neumann, and J. Peters, “Learning robot in-hand manipulation with tactile features,” in IEEE-RAS International Conference on Humanoid Robots (Humanoids), 2015.
- A. Koenig, Z. Liu, L. Janson, and R. Howe, “Tactile grasp refinement using deep reinforcement learning and analytic grasp stability metrics,” 2021.
- C. D’Eramo and G. Chalvatzaki, “Prioritized sampling with intrinsic motivation in multi-task reinforcement learning,” in 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, 2022, pp. 1–8.
- T. Schaul, D. Horgan, K. Gregor, and D. Silver, “Universal value function approximators,” in ICML, 2015.
- G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba, “OpenAI Gym,” 2016.
- E. Todorov, T. Erez, and Y. Tassa, “Mujoco: A physics engine for model-based control,” in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2012, pp. 5026–5033.
Authors: Erdi Sayar, Zhenshan Bing, Carlo D'Eramo, Ozgur S. Oguz, Alois Knoll