Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting (2301.00493v1)

Published 2 Jan 2023 in cs.CV, cs.AI, cs.LG, and cs.RO

Abstract: We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.

Citations (492)

Summary

  • The paper introduces Argoverse 2, a suite of sensor, lidar, and motion forecasting datasets that fill critical gaps in self-driving research.
  • Its sensor dataset offers 1,000 multimodal sequences with 3D annotations across six cities, enhancing object detection and tracking.
  • The lidar and motion forecasting datasets foster self-supervised learning and challenge models with complex, real-world urban scenarios.

Overview of Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting

The paper introduces Argoverse 2 (AV2), a collection of datasets designed to propel research in self-driving perception and forecasting. The dataset suite comprises the Sensor Dataset, the Lidar Dataset, and the Motion Forecasting Dataset, each curated to address specific gaps in existing resources and enhance machine learning model capabilities in the autonomous driving domain.

Sensor Dataset

The Sensor Dataset offers 1,000 15-second multimodal sequences, featuring high-resolution imagery, lidar point clouds, and calibrated vehicle poses. It contains 3D cuboid annotations for 26 object categories, making it highly suitable for 3D perception tasks. The dataset's diversity stems from data captured across six distinct cities, contributing to a robust training ground for models tasked with handling diverse and dynamic urban scenarios.
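A 3D cuboid annotation of the kind described above can be sketched as a simple record: a category label, a pose, and box extents. The field names and class below are illustrative only and are not the actual AV2 devkit API.

```python
from dataclasses import dataclass

@dataclass
class Cuboid:
    """Hypothetical 3D cuboid annotation (illustrative, not the AV2 schema)."""
    category: str    # one of the 26 annotated classes, e.g. "REGULAR_VEHICLE"
    x_m: float       # box center in the map frame (meters)
    y_m: float
    z_m: float
    length_m: float  # extent along the heading direction
    width_m: float
    height_m: float
    yaw_rad: float   # heading about the vertical axis

    def volume(self) -> float:
        """Box volume in cubic meters."""
        return self.length_m * self.width_m * self.height_m

box = Cuboid("REGULAR_VEHICLE", 12.0, -3.5, 0.8, 4.5, 1.9, 1.6, 0.1)
```

In practice each 15-second sequence carries many such cuboids per lidar sweep, indexed by track so the same object can be followed through time.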

Lidar Dataset

Comprising 20,000 sequences of unlabeled lidar point clouds, the Lidar Dataset is the largest of its kind, supporting self-supervised learning and point cloud forecasting tasks. It provides an expansive testbed for developing lidar-based algorithms free from the constraints of predefined labels, facilitating innovation in unsupervised learning methods and temporal point cloud prediction.
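The standard setup for point cloud forecasting splits each sequence of lidar sweeps into an observed window and a forecast horizon. A minimal sketch, with made-up frame counts and random point clouds standing in for real sweeps:

```python
import numpy as np

# A lidar "sequence" as a list of per-sweep point clouds, each an (N, 3)
# array of x/y/z coordinates. Sweep and point counts are illustrative.
rng = np.random.default_rng(0)
sequence = [rng.normal(size=(100, 3)) for _ in range(15)]  # 15 sweeps

def split_sequence(sweeps, n_past):
    """Split a sweep list into (observed, to-be-forecast) windows."""
    return sweeps[:n_past], sweeps[n_past:]

past, future = split_sequence(sequence, n_past=10)
```

A forecasting model consumes `past` and is scored on how well its predicted sweeps match `future`; with 20,000 unlabeled sequences, such supervision comes for free from the data's temporal order.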

Motion Forecasting Dataset

With 250,000 scenarios, the Motion Forecasting Dataset emphasizes complex interactions among diverse actors in urban settings. It pairs rich HD maps with a wide range of actor categories, challenging models to predict future motion in multifaceted environments. Baseline experiments indicate these scenarios are substantially harder than those in prior datasets, underscoring the headroom they leave for future methods.
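Given the track histories described above (position, heading, velocity, category), the simplest forecasting baseline extrapolates each actor's last observed velocity. A minimal constant-velocity sketch, with an illustrative history and horizon rather than the benchmark's actual sampling settings:

```python
import numpy as np

def constant_velocity_forecast(history: np.ndarray, n_future: int) -> np.ndarray:
    """Extrapolate the last observed per-step displacement.

    history: (T, 2) array of past (x, y) positions at a fixed rate.
    Returns an (n_future, 2) array of predicted positions.
    """
    velocity = history[-1] - history[-2]         # displacement per step
    steps = np.arange(1, n_future + 1)[:, None]  # (n_future, 1) step counts
    return history[-1] + steps * velocity

history = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]])
pred = constant_velocity_forecast(history, n_future=3)
# pred: [[3.0, 1.5], [4.0, 2.0], [5.0, 2.5]]
```

Because the scenarios were mined specifically for interesting interactions (turns, yields, merges), such straight-line baselines fare poorly on them, which is exactly the difficulty the dataset is designed to expose.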

Key Contributions and Implications

  1. Taxonomy and Diversity: Argoverse 2 introduces the largest taxonomy to date, ensuring adequate representation of rare objects such as strollers and wheelchairs, which enhances the performance of models in edge cases. This emphasis on diversity improves the robustness and generalization capabilities of trained models.
  2. Map Integration: The inclusion of high-definition maps enhances the datasets' value, providing essential priors in understanding road geometry and improving tasks such as object detection and motion forecasting through spatial awareness.
  3. Self-supervision and Forecasting: By providing extensive unlabeled data, the Lidar Dataset encourages exploration into self-supervised techniques that can derive insights from raw sensor information, paving the way for advancements in learning representations directly from lidar point clouds.
  4. Focus on Interesting Scenarios: By emphasizing complex and non-trivial driving scenarios, the datasets challenge current state-of-the-art methods and stimulate the development of novel approaches that perform well under increased task difficulty.
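The map-integration point above amounts to using lane geometry as a spatial prior: a model can snap an actor's position to nearby lane centerlines before reasoning about its future. A toy sketch of that lookup, with made-up centerline coordinates rather than real AV2 map data:

```python
import numpy as np

# A lane centerline as a sparse 3D polyline of (x, y, z) waypoints,
# as found in the datasets' HD maps. Coordinates here are invented.
centerline = np.array([
    [0.0, 0.0, 0.0],
    [5.0, 0.0, 0.1],
    [10.0, 1.0, 0.2],
])

def nearest_waypoint(query: np.ndarray, polyline: np.ndarray) -> int:
    """Index of the polyline waypoint closest to the query position."""
    dists = np.linalg.norm(polyline - query, axis=1)
    return int(np.argmin(dists))

idx = nearest_waypoint(np.array([4.0, 0.5, 0.0]), centerline)
# idx == 1: the waypoint at (5.0, 0.0, 0.1) is closest
```

Real pipelines interpolate along segments rather than snapping to waypoints, but the principle is the same: the 3D lane and crosswalk geometry shipped with every scenario gives detectors and forecasters a strong prior on where objects can plausibly be and go.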

Future Directions in AI and Autonomous Driving

The introduction of Argoverse 2 sets the stage for several future research directions:

  • Advanced Self-supervised Learning: Leveraging the Lidar Dataset's vast quantity of unlabeled data to develop representations that can serve a wide array of tasks with minimal supervision.
  • Enhanced Model Robustness: Utilizing the datasets' diverse scenarios to improve model performance in rare and complex driving situations, ultimately increasing the safety and reliability of autonomous systems.
  • Improved Generalization: Focusing efforts on achieving model generalization across different geographical and environmental contexts, addressing a critical challenge in real-world deployments of autonomous vehicles.

In summary, Argoverse 2 represents a substantial leap in the resources available to the self-driving research community, facilitating advances in key areas such as perception, forecasting, and self-supervised learning. The datasets' emphasis on diversity, complexity, and self-supervision marks a pivotal step in the evolution of autonomous driving technologies.