
6-DOF Grasping for Target-driven Object Manipulation in Clutter (1912.03628v2)

Published 8 Dec 2019 in cs.RO and cs.CV

Abstract: Grasping in cluttered environments is a fundamental but challenging robotic skill. It requires both reasoning about unseen object parts and potential collisions with the manipulator. Most existing data-driven approaches avoid this problem by limiting themselves to top-down planar grasps which is insufficient for many real-world scenarios and greatly limits possible grasps. We present a method that plans 6-DOF grasps for any desired object in a cluttered scene from partial point cloud observations. Our method achieves a grasp success of 80.3%, outperforming baseline approaches by 17.6% and clearing 9 cluttered table scenes (which contain 23 unknown objects and 51 picks in total) on a real robotic platform. By using our learned collision checking module, we can even reason about effective grasp sequences to retrieve objects that are not immediately accessible. Supplementary video can be found at https://youtu.be/w0B5S-gCsJk.

Authors (5)
  1. Adithyavairavan Murali (13 papers)
  2. Arsalan Mousavian (42 papers)
  3. Clemens Eppner (18 papers)
  4. Chris Paxton (59 papers)
  5. Dieter Fox (201 papers)
Citations (188)

Summary

Analysis of 6-DOF Grasping for Target-driven Object Manipulation in Clutter

This paper advances robotic grasping in cluttered environments by planning full 6-degree-of-freedom (6-DOF) grasps for object manipulation. Most prior data-driven methods restrict themselves to top-down planar grasps, which rules out many feasible grasps and limits applicability in real-world scenes where objects must be approached from arbitrary directions. In contrast, this work introduces a framework for executing 6-DOF grasps from partial point cloud observations, expanding the grasping repertoire of robotic systems while explicitly reasoning about occlusions and potential collisions with surrounding objects.

Methodological Overview

The approach integrates learned models to predict grasp poses from partial point cloud data, adopting a two-stage framework:

  1. Grasp Synthesis for Isolated Objects: The first stage generates candidate grasp configurations with a conditional Variational Autoencoder (VAE), trained to propose grasp poses from object-centric point clouds obtained via instance segmentation.
  2. Collision Prediction in Cluttered Scenes: The second component, CollisionNet, is a discriminative module that scores each proposed grasp against the full scene point cloud, conditioning on the gripper's geometry to predict which grasps can be executed without collision (see the sketch after this list).
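
To make the two-stage pipeline concrete, here is a minimal sketch of how candidate generation and collision filtering compose. The names `grasp_sampler`, `collision_net`, `plan_target_grasps`, and their signatures are hypothetical stand-ins for illustration, not the authors' released API; what the paper describes is the sample-with-a-VAE-then-filter structure.

```python
import torch

def plan_target_grasps(scene_points, target_mask, grasp_sampler, collision_net,
                       num_samples=200, score_threshold=0.5):
    """Propose 6-DOF grasps for the target object and keep collision-free ones.

    scene_points : (N, 3) partial point cloud of the whole scene (numpy array)
    target_mask  : (N,) boolean mask for the target, from instance segmentation
    grasp_sampler, collision_net : hypothetical trained modules standing in
        for the paper's conditional VAE and CollisionNet.
    """
    # Stage 1: sample grasp candidates from the conditional VAE,
    # conditioned on the target object's points only.
    target_points = torch.as_tensor(scene_points[target_mask], dtype=torch.float32)
    with torch.no_grad():
        latents = torch.randn(num_samples, grasp_sampler.latent_dim)
        grasps = grasp_sampler.decode(target_points, latents)  # (num_samples, 4, 4) poses

        # Stage 2: score each candidate against the full scene cloud with the
        # learned collision checker, which conditions on the gripper geometry
        # placed at each candidate pose.
        scene = torch.as_tensor(scene_points, dtype=torch.float32)
        scores = collision_net(scene, grasps)  # (num_samples,) P(collision-free)

    keep = scores >= score_threshold
    return grasps[keep], scores[keep]
```

A typical caller would invoke `plan_target_grasps` once per target and execute the highest-scoring surviving grasp; sampling many candidates and filtering them with a cheap discriminative pass is what makes the CollisionNet stage practical.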

The paper reports a grasp success rate of 80.3% across 9 challenging, cluttered table scenes (23 unknown objects and 51 picks in total), outperforming baseline methods by 17.6% and demonstrating robustness to unseen object parts and surrounding clutter.

Strong Numerical Results and Novel Contributions

The research emphasizes a few key contributions:

  • The proposed 6-DOF grasp synthesis combines learned predictions of successful grasps with scene-level collision reasoning.
  • The system is trained entirely on data generated in simulation, yet transfers to real-world scenarios without additional on-site training.
  • A grasp success rate of 80.3% on 23 previously unseen objects of varied geometry demonstrates the model's adaptability and generalization.

Implications and Future Research Directions

From a practical perspective, the implications of this advancement are substantial for autonomous robotic systems operating in household or industrial environments cluttered with miscellaneous items. The method's capability to prioritize and sequence object retrieval without predefined object models also presents an opportunity for applications in unstructured environments where such information is unknown or dynamically changing.
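
The sequencing behavior follows from the same machinery: if no collision-free grasp exists on the target, the learned collision checker implies that some blocking object must be removed first. Below is a minimal greedy sketch of that loop, reusing the hypothetical `plan_target_grasps` helper from the earlier example; `scene.observe`, `scene.execute`, and the blocker-selection heuristic are illustrative assumptions, not the paper's exact procedure.

```python
def retrieve_target(scene, target_id, grasp_sampler, collision_net, max_steps=5):
    """Greedy retrieval: grasp the target if a collision-free grasp exists,
    otherwise clear the most graspable blocking object first."""
    for _ in range(max_steps):
        points, masks = scene.observe()  # partial cloud + per-object masks (assumed API)
        grasps, scores = plan_target_grasps(
            points, masks[target_id], grasp_sampler, collision_net
        )
        if len(grasps) > 0:
            scene.execute(grasps[scores.argmax()])  # best grasp on the target
            return True
        # No feasible grasp on the target: remove the blocker whose own best
        # grasp scores highest (heuristic choice, ours for illustration).
        best_grasp, best_score = None, -1.0
        for obj_id, mask in masks.items():
            if obj_id == target_id:
                continue
            g, s = plan_target_grasps(points, mask, grasp_sampler, collision_net)
            if len(g) > 0 and s.max() > best_score:
                best_grasp, best_score = g[s.argmax()], s.max()
        if best_grasp is None:
            return False  # nothing graspable at all; give up
        scene.execute(best_grasp)
    return False
```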

Theoretically, this research reinforces the potential of learning-based methods in AI-driven robotic manipulation, showing that perception and collision-reasoning challenges can be addressed in complex real-world settings without explicit object models.

Future research could augment the current system with trajectory planning that accounts for the full manipulator arm's interaction with the environment, extending beyond gripper-level collision checking. Integration into task-level planning frameworks would also be valuable, enabling robots not only to grasp but also to sequence and execute tasks involving multiple objects or collaborative assembly.

Overall, the research presented in this paper lends greater dexterity and autonomy to robotic manipulation workflows and broadens the scope for operating in densely packed and variably structured environments.