DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset (2403.12945v1)

Published 19 Mar 2024 in cs.RO

Abstract: The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a result, even the most general robot manipulation policies today are mostly trained on data collected in a small number of environments with limited scene and task diversity. In this work, we introduce DROID (Distributed Robot Interaction Dataset), a diverse robot manipulation dataset with 76k demonstration trajectories or 350 hours of interaction data, collected across 564 scenes and 84 tasks by 50 data collectors in North America, Asia, and Europe over the course of 12 months. We demonstrate that training with DROID leads to policies with higher performance and improved generalization ability. We open source the full dataset, policy learning code, and a detailed guide for reproducing our robot hardware setup.

DROID: Unveiling a Large-Scale In-The-Wild Robot Manipulation Dataset

Introduction to DROID

DROID (Distributed Robot Interaction Dataset) is a large-scale, diverse dataset aimed at advancing robotic manipulation research. It comprises 76k demonstration trajectories, or 350 hours of interaction data, collected across 564 scenes and 86 tasks by 50 data collectors in North America, Asia, and Europe over 12 months. Each trajectory includes synchronized RGB camera streams, camera calibration data, depth information, and natural language instructions, making the dataset a comprehensive resource for developing and testing robotic manipulation policies.
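As a rough illustration of how one might browse this per-trajectory structure, the sketch below loads a single episode from a local copy of the public release, which is distributed in the RLDS/TFDS episode format. The directory path is a placeholder, and the exact field names depend on the release version, so the example only prints the keys it finds rather than assuming a schema.

```python
# Minimal sketch for inspecting DROID trajectories, assuming the public
# RLDS/TFDS release has been downloaded locally.  The path below is a
# placeholder; field names are printed rather than hard-coded because the
# exact schema depends on the dataset version.
import tensorflow_datasets as tfds

builder = tfds.builder_from_directory("/data/droid/1.0.0")  # hypothetical path
dataset = builder.as_dataset(split="train")

for episode in dataset.take(1):
    steps = list(episode["steps"])  # one demonstration = an ordered sequence of steps
    first = steps[0]
    print("trajectory length:", len(steps))
    print("step keys:", list(first.keys()))                    # e.g. observation, action, ...
    print("observation keys:", list(first["observation"].keys()))
```

From such episodes, a policy-learning pipeline would typically assemble (observation, action) training pairs together with the associated language instruction.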

Dataset Collection and Composition

The DROID dataset is the product of a collaborative effort across 13 institutions, all using a standardized hardware setup built around the Franka Panda robot arm. This hardware consistency across diverse collection sites was instrumental in accumulating a large and varied dataset. Data collection protocols were designed to maximize diversity: collectors were encouraged to record a wide variety of tasks and to frequently rearrange scenes. After collection, all trajectories were post-processed, including crowdsourced natural language annotation, to ensure high-quality, usable data for research.

Overview of Data Diversity

A distinguishing feature of DROID is its diversity along the dimensions that matter most for robotic manipulation research: tasks, objects, scenes, viewpoints, and interaction points. The dataset covers a wide range of manipulation tasks, as reflected in the varied verbs used in the instructions accompanying the trajectories, and features interactions with a broad array of everyday objects across numerous scenes, including industrial and home environments, offices, kitchens, dining rooms, and bedrooms. The data also spans a multitude of camera viewpoints and interaction locations, providing a rich resource for developing policies that generalize across settings.
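To make the verb-diversity point concrete, here is a small, hypothetical sketch of such an analysis: it lemmatizes instruction strings with spaCy and counts distinct verbs. The instructions shown are made-up examples, and the paper's actual annotation analysis may use a different pipeline.

```python
# Illustrative sketch of a verb-diversity analysis over language instructions:
# extract verb lemmas from each instruction and count the distinct ones.
# The instruction strings below are made-up examples.
from collections import Counter
import spacy

nlp = spacy.load("en_core_web_sm")  # requires: python -m spacy download en_core_web_sm

instructions = [
    "pick up the mug and place it on the shelf",
    "open the top drawer",
    "wipe the table with the cloth",
]

verbs = Counter()
for text in instructions:
    doc = nlp(text)
    verbs.update(tok.lemma_ for tok in doc if tok.pos_ == "VERB")

print("distinct verbs:", len(verbs))
print(verbs.most_common(5))
```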

Experimental Validation

Experimental evaluation of policies trained using the DROID dataset demonstrates significant improvements in performance and robustness across a variety of tasks and conditions. Policies co-trained with DROID outperform those trained solely on in-domain data or with other existing large-scale datasets. The diversity inherent in DROID, especially in terms of scene variety, plays a crucial role in these advancements. Tasks for evaluation were selected to cover a wide spectrum of real robot usage scenarios, ranging from simple manipulation tasks to more complex, multi-step processes. The resultant policies showed marked improvements in handling both in-distribution and out-of-distribution (OOD) variations, highlighting DROID's efficacy in enhancing policy generalization.
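The co-training setup can be pictured as a simple data-mixing loop: each batch combines in-domain demonstrations with demonstrations sampled from DROID at a fixed ratio. The sketch below is a generic illustration under that assumption; the batch size, mixing fraction, and dataset objects are placeholders rather than the paper's exact configuration.

```python
# Minimal sketch of co-training-style data mixing: each batch draws a fixed
# fraction of samples from DROID and the remainder from a small in-domain
# dataset.  Batch size, fraction, and the toy data are illustrative only.
import random

def cotraining_batches(in_domain, droid, batch_size=64, droid_fraction=0.5):
    """Yield batches mixing in-domain and DROID samples at a fixed fraction."""
    n_droid = int(batch_size * droid_fraction)
    n_in = batch_size - n_droid
    while True:
        batch = random.sample(droid, n_droid) + random.sample(in_domain, n_in)
        random.shuffle(batch)
        yield batch

# Usage with toy stand-in data (a real pipeline would draw trajectory chunks):
in_domain = [{"source": "in_domain", "idx": i} for i in range(200)]
droid = [{"source": "droid", "idx": i} for i in range(10_000)]
batch = next(cotraining_batches(in_domain, droid))
print(sum(x["source"] == "droid" for x in batch), "DROID samples in the batch")
```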

Concluding Remarks and Future Directions

The introduction of DROID represents a milestone in robotic manipulation research, offering a dataset of unprecedented scale and diversity. Its successful application in improving policy performance and robustness signals the dataset's potential as a cornerstone for future research endeavors in the field. The detailed documentation, open-sourced data, and adaptable hardware platform accompanying DROID further its accessibility and utility to the broader research community. Future explorations could delve into optimizing the utilization of this diverse dataset, investigating novel learning paradigms, and expanding the dataset to encompass even wider scenarios and tasks.

The DROID dataset, through its comprehensive design and demonstrated utility, opens new avenues for the development of robust, generalizable robotic manipulation policies, marking a significant step forward in the quest towards versatile and adaptable robotic systems.
