Symmetry-aware Reinforcement Learning for Robotic Assembly under Partial Observability with a Soft Wrist (2402.18002v2)

Published 28 Feb 2024 in cs.RO, cs.AI, and cs.LG

Abstract: This study tackles the representative yet challenging contact-rich peg-in-hole task in robotic assembly, using a soft wrist that can operate more safely and tolerate lower-frequency control signals than a rigid one. Previous studies often adopt a fully observable formulation, which requires external setups or estimators for the peg-to-hole pose. In contrast, we use a partially observable formulation and deep reinforcement learning from demonstrations to learn a memory-based agent that acts purely on haptic and proprioceptive signals. Moreover, previous works do not exploit potential domain symmetry and thus must search for solutions in a larger space. Instead, we propose to leverage the symmetry for sample efficiency by augmenting the training data and constructing auxiliary losses that force the agent to adhere to the symmetry. Simulation results with five different symmetric peg shapes show that the proposed agent matches or even outperforms a state-based agent. In particular, this sample efficiency allows us to learn directly on the real robot within 3 hours.
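The abstract describes two symmetry mechanisms: augmenting the training data with symmetry-transformed copies of each transition, and adding auxiliary losses that penalize deviations from equivariance. The sketch below is a minimal illustration of both ideas only, assuming a planar cyclic rotation symmetry C_n about the hole axis and a hypothetical `policy` callable that maps observations to planar actions; it is not the authors' implementation, and the names `rotate_planar`, `augment_transitions`, and `symmetry_aux_loss` are invented for this example.

```python
import torch

# Assumptions (not from the paper): observations, actions, and next observations
# all carry their planar (x, y) components in the first two entries, and the
# domain symmetry is the cyclic rotation group C_n about the hole axis.


def rotate_planar(x: torch.Tensor, angle: torch.Tensor) -> torch.Tensor:
    """Rotate the first two (x, y) components of each vector in a batch."""
    c, s = torch.cos(angle), torch.sin(angle)
    rot = torch.stack([torch.stack([c, -s]), torch.stack([s, c])])  # (2, 2)
    out = x.clone()
    out[..., :2] = x[..., :2] @ rot.T
    return out


def augment_transitions(obs, action, next_obs, n: int = 4):
    """Data augmentation: replicate a transition under every element of C_n."""
    angles = torch.arange(n, dtype=torch.float32) * (2.0 * torch.pi / n)
    return [
        (rotate_planar(obs, a), rotate_planar(action, a), rotate_planar(next_obs, a))
        for a in angles
    ]


def symmetry_aux_loss(policy, obs, n: int = 4) -> torch.Tensor:
    """Auxiliary loss encouraging equivariance: policy(g . obs) ~ g . policy(obs)."""
    base_action = policy(obs)
    loss = obs.new_zeros(())
    for k in range(1, n):
        a = torch.tensor(2.0 * torch.pi * k / n)
        loss = loss + torch.mean(
            (policy(rotate_planar(obs, a)) - rotate_planar(base_action, a)) ** 2
        )
    return loss / (n - 1)
```

In an off-policy pipeline of the kind the paper builds on, the augmented transitions would presumably be added to the replay buffer and the auxiliary term added to the policy loss with a weighting coefficient; the actual group and transformation of the haptic and proprioceptive signals depend on the peg shape and sensor layout.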
