GraNet: A Multi-Level Graph Network for 6-DoF Grasp Pose Generation in Cluttered Scenes (2312.03345v1)

Published 6 Dec 2023 in cs.RO and cs.CV

Abstract: 6-DoF object-agnostic grasping in unstructured environments is a critical yet challenging task in robotics. Most current works use non-optimized approaches to sample grasp locations and learn spatial features without concerning the grasping task. This paper proposes GraNet, a graph-based grasp pose generation framework that translates a point cloud scene into multi-level graphs and propagates features through graph neural networks. By building graphs at the scene level, object level, and grasp point level, GraNet enhances feature embedding at multiple scales while progressively converging to the ideal grasping locations by learning. Our pipeline can thus characterize the spatial distribution of grasps in cluttered scenes, leading to a higher rate of effective grasping. Furthermore, we enhance the representation ability of scalable graph networks by a structure-aware attention mechanism to exploit local relations in graphs. Our method achieves state-of-the-art performance on the large-scale GraspNet-1Billion benchmark, especially in grasping unseen objects (+11.62 AP). The real robot experiment shows a high success rate in grasping scattered objects, verifying the effectiveness of the proposed approach in unstructured environments.

Summary

  • The paper introduces a multi-level graph network that significantly enhances 6-DoF grasp pose generation in complex, cluttered environments.
  • It employs graph neural networks on 3D point clouds to capture spatial features across scene, object, and grasp levels.
  • The paper validates GraNet with extensive experiments on the GraspNet-1Billion benchmark and real robot trials, demonstrating state-of-the-art benchmark performance and high real-world grasp success rates.

Introduction to GraNet

The development of robotic systems capable of 6-DoF grasping in cluttered environments remains a significant challenge in robotics. Many existing methods fall short in adaptability or generalization, particularly when handling unknown objects and complex scenes.

Graph Network Approach

The paper introduces GraNet, a multi-level graph network that aims to advance the state of the art in 6-DoF grasp pose generation. The approach operates on point clouds, 3D representations of a scene, to determine optimal grasping poses for robotic manipulators. By constructing graphs at the scene, object, and grasp-point levels, GraNet progressively converges on ideal grasping locations through feature propagation with graph neural networks (GNNs). This hierarchical, cascading structure improves spatial feature learning and leads to a noticeably higher rate of effective grasps in cluttered scenes. Graph-based strategies are advantageous because they model the relations between points, which strengthens the understanding of the geometric structure crucial for grasp prediction.
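
To make the graph construction concrete, the sketch below builds a k-nearest-neighbour graph over a point cloud and runs one round of edge-feature message passing, a stand-in for the scene-level stage described above. It is a minimal illustration rather than the authors' implementation; the tensor shapes, feature dimensions, and max-aggregation rule are assumptions.

import torch

def knn_graph(points, k):
    # points: (N, 3) -> (N, k) indices of the k nearest neighbours per point
    dists = torch.cdist(points, points)                     # (N, N) pairwise distances
    return dists.topk(k + 1, largest=False).indices[:, 1:]  # drop the self-match in column 0

def message_pass(feats, neighbors, mlp):
    # feats: (N, C); neighbors: (N, k); returns (N, C_out) after max-aggregation
    nbr_feats = feats[neighbors]                             # (N, k, C) gather neighbour features
    center = feats.unsqueeze(1).expand_as(nbr_feats)         # (N, k, C) repeat the centre feature
    edge = torch.cat([center, nbr_feats - center], dim=-1)   # (N, k, 2C) edge features
    return mlp(edge).max(dim=1).values                       # aggregate over the neighbourhood

# Usage on a random "scene" of 2048 points with 32-d input features
points = torch.rand(2048, 3)
feats = torch.rand(2048, 32)
mlp = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU(), torch.nn.Linear(64, 64))
neighbors = knn_graph(points, k=16)
scene_feats = message_pass(feats, neighbors, mlp)            # (2048, 64)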

GraspNet-1Billion and Performance

The effectiveness of GraNet is demonstrated on the large-scale GraspNet-1Billion dataset, which contains over a billion annotated grasp poses. The model achieves state-of-the-art performance, most notably when identifying grasps for previously unseen objects (+11.62 AP). The network pairs a structure-aware attention mechanism with multi-hop connectivity to exploit local relations in the graphs, while the learned selection of grasp points reduces redundant computation on non-graspable locations.
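
As a rough illustration of how attention over a local graph neighbourhood can weigh neighbours by their geometric relation to the centre point, the following sketch scores each neighbour from its feature and relative position and aggregates with softmax weights. The scoring MLP and the use of relative coordinates are illustrative assumptions, not the paper's exact structure-aware attention design.

import torch

def attentive_aggregate(feats, points, neighbors, score_mlp):
    # feats: (N, C); points: (N, 3); neighbors: (N, k); returns (N, C)
    nbr_feats = feats[neighbors]                                  # (N, k, C)
    rel_pos = points[neighbors] - points.unsqueeze(1)             # (N, k, 3) local geometric structure
    scores = score_mlp(torch.cat([nbr_feats, rel_pos], dim=-1))   # (N, k, 1) per-edge score
    weights = torch.softmax(scores, dim=1)                        # normalise over the k neighbours
    return (weights * nbr_feats).sum(dim=1)                       # attention-weighted aggregation

# score_mlp maps (C + 3) -> 1; e.g. for C = 64:
score_mlp = torch.nn.Sequential(torch.nn.Linear(67, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))

points = torch.rand(1024, 3)
feats = torch.rand(1024, 64)
neighbors = torch.cdist(points, points).topk(17, largest=False).indices[:, 1:]  # 16-NN graph
out = attentive_aggregate(feats, points, neighbors, score_mlp)                  # (1024, 64)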

Implementation and Real Robot Experimentation

The implementation of the network incorporates several components, including graph feature embedding networks and a learning-based grasp point selection mechanism. Experimentation extends beyond the benchmark to real-world robotic trials in which GraNet grasps novel items. The reported success rates underscore the network's robustness and its potential for practical applications such as automated assembly lines, warehouse sorting, and other settings where robots handle a diverse range of objects in disordered states.
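
One way to picture the learning-based grasp point selection mentioned above is a small per-point scoring head that keeps only the top-scoring candidates for later pose estimation; the head architecture and the number of retained points below are hypothetical choices, not values from the paper.

import torch

class GraspPointSelector(torch.nn.Module):
    # Predicts a per-point "graspability" score and keeps the top-k candidates.
    def __init__(self, in_dim, num_keep=256):
        super().__init__()
        self.head = torch.nn.Sequential(
            torch.nn.Linear(in_dim, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))
        self.num_keep = num_keep

    def forward(self, feats, points):
        # feats: (N, C); points: (N, 3)
        scores = self.head(feats).squeeze(-1)        # (N,) graspability per point
        idx = scores.topk(self.num_keep).indices     # indices of the most promising points
        return points[idx], feats[idx], scores[idx]

# Usage: keep 256 candidate grasp points out of 2048
selector = GraspPointSelector(in_dim=64)
points, feats = torch.rand(2048, 3), torch.rand(2048, 64)
cand_pts, cand_feats, cand_scores = selector(feats, points)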

Conclusion

The paper concludes that GraNet's multi-level graph network can effectively interpret complex scenes and select the most appropriate grasping points without prior knowledge of the objects. This marks a significant step forward in robotic grasping, paving the way for more efficient and adaptable autonomous systems. With further development, such systems could improve product handling and delivery processes across a range of industries. The integration of the grasping task into the feature extraction network emerges as a particularly powerful tactic and reflects a promising direction for future research in robotic manipulation.