GraNet: A Multi-Level Graph Network for 6-DoF Grasp Pose Generation in Cluttered Scenes (2312.03345v1)
Abstract: 6-DoF object-agnostic grasping in unstructured environments is a critical yet challenging task in robotics. Most current works sample grasp locations with non-optimized strategies and learn spatial features without regard to the grasping task. This paper proposes GraNet, a graph-based grasp pose generation framework that translates a point cloud scene into multi-level graphs and propagates features through graph neural networks. By building graphs at the scene level, object level, and grasp point level, GraNet enhances feature embedding at multiple scales while learning to progressively converge on ideal grasping locations. Our pipeline can thus characterize the spatial distribution of grasps in cluttered scenes, leading to a higher rate of effective grasping. Furthermore, we enhance the representation ability of scalable graph networks with a structure-aware attention mechanism that exploits local relations in graphs. Our method achieves state-of-the-art performance on the large-scale GraspNet-1Billion benchmark, especially in grasping unseen objects (+11.62 AP). A real-robot experiment shows a high success rate in grasping scattered objects, verifying the effectiveness of the proposed approach in unstructured environments.
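The core mechanism the abstract describes, building graphs over point-cloud samples and propagating features through them, can be illustrated with a minimal, hypothetical sketch. This is not the authors' implementation: `knn_graph` and `propagate` are illustrative stand-ins, and the distance-based softmax weighting below is only a toy proxy for GraNet's learned structure-aware attention.

```python
import math

def knn_graph(points, k):
    """Brute-force k-nearest-neighbor graph over a list of 3D points.
    Returns a dict mapping each point index to its k neighbor indices."""
    n = len(points)
    edges = {}
    for i in range(n):
        dists = sorted((math.dist(points[i], points[j]), j)
                       for j in range(n) if j != i)
        edges[i] = [j for _, j in dists[:k]]
    return edges

def propagate(points, feats, edges):
    """One message-passing step: each node adds a weighted average of its
    neighbors' features, with softmax-style weights over negative distances
    (a toy stand-in for a learned attention mechanism)."""
    out = []
    for i in range(len(points)):
        nbrs = edges[i]
        w = [math.exp(-math.dist(points[i], points[j])) for j in nbrs]
        s = sum(w)
        dim = len(feats[i])
        agg = [sum(wj * feats[j][c] for wj, j in zip(w, nbrs)) / s
               for c in range(dim)]
        out.append([fi + a for fi, a in zip(feats[i], agg)])
    return out
```

In GraNet this kind of propagation would run at each level (scene, object, grasp point) with learned weights rather than fixed distance kernels, so that features are refined as the graph is progressively narrowed toward candidate grasp locations.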