- The paper introduces the GG-CNN, a novel lightweight model that predicts pixel-wise grasp quality and pose for efficient closed-loop control.
- It employs a generative approach that avoids sampling and ranking discrete grasp candidates, reducing computational overhead enough to enable real-time, closed-loop grasp synthesis at 50 Hz.
- Experimental results show grasp success rates of 83% on unseen adversarial objects, 88% on household items moved during the grasp attempt, and 81% in dynamic clutter.
Overview of "Closing the Loop for Robotic Grasping: A Real-time, Generative Grasp Synthesis Approach"
The paper presents the Generative Grasping Convolutional Neural Network (GG-CNN), an approach to object-independent grasp synthesis that enables effective, real-time, closed-loop grasping. It addresses two limitations of existing deep learning methods: the inefficiency of sampling and classifying discrete grasp candidates, and computation times long enough to force open-loop execution.
Key Contributions
The GG-CNN is a lightweight neural network that predicts the quality and pose of a grasp at every pixel of a depth image. Its generative formulation contrasts with traditional methods that classify sampled grasp candidates, generating grasp poses directly and more efficiently. The model's reduced complexity allows execution at 50 Hz, enabling the closed-loop control needed to handle dynamic environments and inaccuracies in robot control.
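As a rough illustration of this fully convolutional, pixel-wise design, the sketch below builds a small encoder-decoder network in Keras. The layer sizes and filter counts are assumptions for demonstration rather than the paper's exact architecture; the four output heads correspond to the grasp parameters predicted per pixel (quality, grasp angle encoded as cos 2θ and sin 2θ, and gripper width).

```python
# Minimal sketch of a GG-CNN-style fully convolutional network.
# Layer sizes are illustrative assumptions, not the paper's exact design.
from tensorflow.keras import layers, Model

def build_ggcnn_sketch(input_shape=(300, 300, 1)):
    depth = layers.Input(shape=input_shape, name="depth_image")

    # Encoder: strided convolutions progressively downsample the depth image.
    x = layers.Conv2D(32, 9, strides=3, padding="same", activation="relu")(depth)
    x = layers.Conv2D(16, 5, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2D(8, 3, strides=2, padding="same", activation="relu")(x)

    # Decoder: transposed convolutions restore the full input resolution,
    # so every output map has one value per input pixel.
    x = layers.Conv2DTranspose(8, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(16, 5, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(32, 9, strides=3, padding="same", activation="relu")(x)

    # One output map per grasp parameter, predicted densely for every pixel.
    quality = layers.Conv2D(1, 2, padding="same", name="grasp_quality")(x)
    cos_2t = layers.Conv2D(1, 2, padding="same", name="cos_2theta")(x)
    sin_2t = layers.Conv2D(1, 2, padding="same", name="sin_2theta")(x)
    width = layers.Conv2D(1, 2, padding="same", name="grasp_width")(x)

    return Model(depth, [quality, cos_2t, sin_2t, width])
```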
Methodology
The GG-CNN generates antipodal grasps by estimating grasp quality, angle, and gripper width at each pixel of the input depth image. It is trained on the Cornell Grasping Dataset and, with only 62k parameters, computes its predictions in roughly 19 ms per frame. The network was evaluated in a series of experiments covering both static and dynamic object setups.
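To make the pixel-wise representation concrete, the sketch below shows one plausible way to decode the network's output maps into a single grasp (the function and its signature are hypothetical, not the paper's code). Following the paper's angle encoding, the grasp angle is recovered from cos 2θ and sin 2θ maps, which resolves the rotational symmetry of antipodal grasps, and the best grasp is read off at the pixel with the highest predicted quality.

```python
import numpy as np

def decode_best_grasp(quality, cos_2t, sin_2t, width):
    """Pick the highest-quality pixel and decode its grasp parameters.

    Each argument is an (H, W) map from the network. The angle is
    encoded as cos(2*theta) and sin(2*theta) because an antipodal
    grasp is symmetric under a rotation of pi, so 2*theta is unique.
    """
    # Pixel coordinates (row v, column u) of the best-scoring grasp.
    v, u = np.unravel_index(np.argmax(quality), quality.shape)

    # Recover the grasp angle in [-pi/2, pi/2] from the 2*theta encoding.
    theta = 0.5 * np.arctan2(sin_2t[v, u], cos_2t[v, u])

    return {
        "pixel": (u, v),                  # image location of the grasp center
        "angle": float(theta),            # gripper rotation in the image plane
        "width": float(width[v, u]),      # gripper opening at that pixel
        "quality": float(quality[v, u]),  # predicted chance of grasp success
    }
```

In practice a smoothing filter over the quality map helps avoid selecting isolated noisy peaks, and the depth value at the chosen pixel fixes the grasp's 3D position via the camera intrinsics.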
Experimental Evaluation
The experiments were conducted with a Kinova Mico robot equipped with an Intel RealSense SR300 camera. The evaluation used two object sets: adversarially shaped 3D-printed objects and a diverse array of household items. The GG-CNN achieved grasp success rates of 83% on the adversarial set and 88% on household items that were moved during the grasp attempt. It also grasped objects in dynamic clutter with 81% success, demonstrating robustness to object movement during grasp attempts.
Comparative Analysis
The paper compares the GG-CNN with several state-of-the-art grasp synthesis approaches, demonstrating competitive grasp success rates at a fraction of the computational cost. Unlike large models with millions of parameters, the GG-CNN's compact architecture permits closed-loop operation, a decisive advantage in non-static environments.
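To show why inference speed translates into closed-loop capability, the sketch below outlines the structure of a visual-servoing grasp loop that re-plans on every camera frame. All interfaces here (camera, robot, network, decode) are hypothetical placeholders for a real perception and control stack, not the paper's implementation.

```python
import time

CONTROL_RATE_HZ = 50  # target rate; the paper runs its pipeline at 50 Hz

def closed_loop_grasp(camera, robot, network, decode):
    """Re-plan the grasp on every frame until the gripper reaches the object.

    `camera`, `robot`, `network`, and `decode` are assumed interfaces
    standing in for real hardware and models; only the loop structure
    is the point of this sketch.
    """
    while not robot.grasp_complete():
        depth = camera.read_depth()        # latest depth frame
        maps = network.predict(depth)      # pixel-wise grasp parameter maps
        grasp = decode(*maps)              # best grasp for this frame
        robot.servo_toward(grasp)          # velocity command toward the grasp
        time.sleep(1.0 / CONTROL_RATE_HZ)  # keep the loop at the control rate
    robot.close_gripper()
```

Because a fresh grasp is computed every cycle, the system can track objects that move mid-attempt and correct for control or calibration errors, which is where the paper reports its largest gains over open-loop baselines.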
Implications and Future Directions
This research advances practical applications in robotics, particularly where real-time reaction to environmental changes is crucial. The efficiency and speed of GG-CNN pave the way for integration into various robotic systems, enhancing their ability to operate autonomously in unstructured settings. Future work could explore enhancements in perception capabilities, further robustness to feedback errors, and expanded training datasets incorporating a wider range of object geometries and complexities.
The GG-CNN serves as a promising foundation for future developments in robotic vision and manipulation, with potential contributions to fields such as robotics for unpredictable environments and flexible manufacturing systems.