Antipodal Robotic Grasping using Generative Residual Convolutional Neural Network (1909.04810v4)

Published 11 Sep 2019 in cs.RO and cs.CV

Abstract: In this paper, we present a modular robotic system to tackle the problem of generating and performing antipodal robotic grasps for unknown objects from an n-channel image of the scene. We propose a novel Generative Residual Convolutional Neural Network (GR-ConvNet) model that can generate robust antipodal grasps from n-channel input at real-time speeds (~20 ms). We evaluate the proposed model architecture on standard datasets and a diverse set of household objects. We achieved state-of-the-art accuracy of 97.7% and 94.6% on the Cornell and Jacquard grasping datasets respectively. We also demonstrate a grasp success rate of 95.4% and 93% on household and adversarial objects respectively using a 7 DoF robotic arm.

Summary

The paper introduces GR-ConvNet, a novel architecture that generates antipodal grasp configurations in real time, with inference taking approximately 20 ms.
It reports state-of-the-art accuracy of 97.7% on the Cornell dataset and 94.6% on the Jacquard dataset, outperforming previous models.
The modular system integrates ROS for flexible robotic deployment, demonstrating high success rates in cluttered and dynamic environments.

Antipodal Robotic Grasping using a Generative Residual Convolutional Neural Network

The paper "Antipodal Robotic Grasping using a Generative Residual Convolutional Neural Network" by Kumra, Joshi, and Sahin presents a modular system for grasping unknown objects from images of the scene in real time. At its core is the proposed Generative Residual Convolutional Neural Network (GR-ConvNet), which generates robust antipodal grasps from n-channel image inputs. The approach is evaluated on standard datasets and in real-world grasping trials, where it reports higher accuracy than prior models at real-time inference speeds.
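As a rough illustration of what an n-channel input could look like, the sketch below stacks a depth map and an RGB image into one channels-first array. The function name make_n_channel_input, the channel ordering, and the mean-centring step are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def make_n_channel_input(rgb=None, depth=None):
    """Stack the available modalities into a (C, H, W) float32 array.

    Hypothetical helper: depth contributes 1 channel, RGB contributes 3,
    so the result has n = 1, 3, or 4 channels.
    """
    channels = []
    if depth is not None:
        d = np.asarray(depth, dtype=np.float32)
        d = d - d.mean()  # assumed normalisation: centre depth around zero
        channels.append(d[np.newaxis, ...])  # (H, W) -> (1, H, W)
    if rgb is not None:
        c = np.asarray(rgb, dtype=np.float32) / 255.0  # scale 8-bit colour to [0, 1]
        c = c - c.mean()  # assumed normalisation: centre colour around zero
        channels.append(np.moveaxis(c, -1, 0))  # (H, W, 3) -> (3, H, W)
    if not channels:
        raise ValueError("provide at least one of rgb or depth")
    # Concatenate along the channel axis; mismatched image sizes raise ValueError.
    return np.concatenate(channels, axis=0)


# Synthetic 4x5 scene with both modalities, giving a 4-channel input.
rgbd = make_n_channel_input(
    rgb=np.zeros((4, 5, 3), dtype=np.uint8),
    depth=np.ones((4, 5)),
)
print(rgbd.shape)
```

Keeping the modalities optional means the same network input code path can be reused when only a depth or only a colour camera is available.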

Key Contributions and Findings

GR-ConvNet Architecture: The paper introduces a novel neural network architecture that generates antipodal grasps pixel-wise: from an n-channel input image it produces three output images giving the grasp quality, angle, and width at each pixel. Because every pixel carries a candidate grasp, a single forward pass (approximately 20 ms) yields multiple grasp configurations, which supports use in dynamically changing environments.
Rigorous Evaluation: The effectiveness of GR-ConvNet was assessed on both the Cornell and Jacquard grasping datasets. This model showed state-of-the-art accuracy of 97.7% and 94.6%, respectively, outperforming previous architectures such as AlexNet and ResNet extensions. Furthermore, the model achieved high success rates of 95.4% on household objects and 93% on more challenging adversarial objects using a 7 DoF robotic arm.
Modular Implementation: The system is designed with a modular architecture comprising an inference module for grasp prediction and a control module for executing robot trajectories. This adaptability is enhanced by its integration with ROS, offering flexibility for deployment on different robotic platforms without extensive reconfiguration.
Efficiency in Real-World Applications: In practical scenarios, such as grasping from cluttered environments, GR-ConvNet maintained a 93.5% success rate, affirming its robustness across varied object settings. These attributes position GR-ConvNet as a viable candidate for real-time applications, including automated warehouses and supply chain robotics, where speed and accuracy are critical.
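The three output images described above can be turned into a single grasp by reading the angle and width at the pixel with the highest quality. The following is a minimal sketch of that idea using NumPy. The Grasp class, the best_grasp function, and the units (radians, pixels) are hypothetical choices for illustration, not the authors' implementation.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Grasp:
    """Hypothetical container for one grasp in image coordinates."""
    row: int
    col: int
    angle: float    # assumed to be in radians
    width: float    # assumed to be in pixels
    quality: float  # higher means a more promising grasp


def best_grasp(quality, angle, width):
    """Pick the grasp at the pixel where the quality map peaks."""
    quality = np.asarray(quality)
    angle = np.asarray(angle)
    width = np.asarray(width)
    if not (quality.shape == angle.shape == width.shape):
        raise ValueError("quality, angle, and width maps must share one shape")
    # argmax works on the flattened map; unravel_index recovers (row, col).
    row, col = np.unravel_index(np.argmax(quality), quality.shape)
    return Grasp(
        row=int(row),
        col=int(col),
        angle=float(angle[row, col]),
        width=float(width[row, col]),
        quality=float(quality[row, col]),
    )


# Synthetic 3x4 maps with a single quality peak at row 1, column 2.
q = np.zeros((3, 4))
q[1, 2] = 0.9
a = np.full((3, 4), 0.5)
w = np.full((3, 4), 20.0)
print(best_grasp(q, a, w))
```

In a modular setup like the one summarised here, a function of this kind would sit at the end of the inference module, and the control module would still need to convert the pixel coordinates into the robot's frame before planning a trajectory.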

Theoretical and Practical Implications

The reported results and methods represent a significant advance in robotic grasping. Embedding a generative, pixel-wise formulation in a convolutional network lets the model propose many candidate grasps per image. Practically, the research addresses a fundamental challenge in robotic manipulation by providing reliable grasp configurations for unfamiliar objects, which makes robotic systems more effective in unstructured and dynamic environments.

Theoretically, this research paves the way for further exploration of generative and residual models in robotics, including adaptive learning for different gripper configurations and advanced inpainting methods for handling reflective materials.

Future Research Directions

This research opens several avenues for future exploration, including:

Extending GR-ConvNet to accommodate diverse gripper types beyond two-fingered models, potentially incorporating suction and multifingered options.
Incorporating advanced depth prediction to improve performance on transparent and reflective objects, which pose challenges to depth-based sensing.
Investigating closed-loop feedback mechanisms that integrate GR-ConvNet outputs, allowing for real-time adjustments in robotic trajectories to optimize grasp execution.

The integration of sophisticated neural network models into robotic systems presents transformative possibilities in automation, and GR-ConvNet exemplifies how these technologies can be effectively harnessed to address complex manipulation tasks.
