An Evaluation of the ONCE Dataset for Autonomous Driving
The paper "One Million Scenes for Autonomous Driving: ONCE Dataset" presents a large-scale resource for advancing 3D object detection in autonomous driving systems. The dataset substantially improves on existing datasets in the quantity and diversity of scene data, covering a wide range of geographical locations, weather conditions, and times of day. It comprises 1 million LiDAR scenes and 7 million corresponding camera images collected over 144 driving hours, a roughly twenty-fold increase in driving hours over existing 3D datasets such as nuScenes and the Waymo Open Dataset.
Key Aspects of the ONCE Dataset
The ONCE dataset significantly contributes to the exploration of fully supervised, semi-supervised, and self-supervised learning methods for 3D perception models. Noteworthy aspects include:
- Scale and Diversity: The dataset is exceptionally large, capturing a vast array of outdoor scenes that include urban and rural areas, highways, and various weather conditions. Such diversity offers robust testing grounds for perception models aiming to achieve high accuracy in real-world scenarios.
- Unlabeled Data Utilization: It addresses the challenge of training models with a limited amount of annotated data by offering unlabeled data crucial for self-supervised and semi-supervised learning paradigms.
- Benchmarking Efforts: A benchmark is established to evaluate various self-supervised and semi-supervised learning algorithms on this dataset, providing a standardized comparison framework which is essential for consistent assessment of algorithmic improvements.
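As a concrete illustration of the semi-supervised paradigm these benchmarks target, the sketch below shows a pseudo-labeling loop in which a teacher model's confident predictions on unlabeled scenes become training targets for a student. The function names, data layout, and score threshold are illustrative assumptions, not part of the ONCE toolkit.

```python
# Hedged sketch of pseudo-labeling for semi-supervised 3D detection.
# `teacher_predict` is a stand-in for a trained detector's inference call.

def teacher_predict(scene):
    # Placeholder: a real teacher returns (box, confidence) detections.
    return scene["detections"]

def pseudo_label(unlabeled_scenes, score_thresh=0.8):
    """Keep only confident teacher detections as training targets."""
    labeled = []
    for scene in unlabeled_scenes:
        boxes = [(box, s) for box, s in teacher_predict(scene)
                 if s >= score_thresh]
        if boxes:  # discard scenes with no confident detections
            labeled.append({"scene": scene, "pseudo_boxes": boxes})
    return labeled

# Toy example: only the first scene has a confident detection.
scenes = [{"detections": [("car", 0.95), ("pedestrian", 0.4)]},
          {"detections": [("cyclist", 0.2)]}]
out = pseudo_label(scenes)
print(len(out), len(out[0]["pseudo_boxes"]))  # prints "1 1"
```

In practice the threshold trades pseudo-label precision against recall, which is exactly the kind of design choice a standardized benchmark makes comparable across methods.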
Evaluation of Detection Models
Various 3D object detection models were implemented and evaluated on the ONCE dataset, including point-based, voxel-based, pillar-based, and multi-modality fusion methods:
- Voxel-based Methods: Voxel-based detectors such as SECOND efficiently exploit the three-dimensional structure of the data, outperforming point-based methods through their ability to aggregate features over spatial neighborhoods.
- Center-based Assignments: Techniques such as the center-based assignment used by CenterPoint improve detection of smaller objects through more precise localization, yielding notable gains on pedestrians and cyclists.
- Multi-Modality Fusion: The benchmark shows that models fusing point clouds and images do not outperform LiDAR-only methods without specific optimizations, likely due to the complexity of multi-sensor calibration and fusion.
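The voxel-based pipeline above begins by discretizing the point cloud into a sparse grid before any feature aggregation. A minimal NumPy sketch of that first stage, with illustrative voxel sizes rather than the settings used in the paper:

```python
# Minimal sketch of point-cloud voxelization, the first stage of a
# voxel-based detector like SECOND. Parameters here are illustrative.
import numpy as np

def voxelize(points, voxel_size=(0.1, 0.1, 0.2), max_points=32):
    """Group LiDAR points (N, 3) into voxels; returns voxel_coord -> points."""
    coords = np.floor(points / np.asarray(voxel_size)).astype(np.int64)
    voxels = {}
    for c, p in zip(map(tuple, coords), points):
        bucket = voxels.setdefault(c, [])
        if len(bucket) < max_points:  # cap points per voxel
            bucket.append(p)
    return voxels

# Toy cloud: two nearby points share a voxel; the third lands elsewhere.
pts = np.array([[0.05, 0.05, 0.05],
                [0.06, 0.04, 0.07],
                [1.05, 1.05, 0.05]])
voxels = voxelize(pts)
print(len(voxels))  # prints "2"
```

Only occupied voxels are stored, which is what makes sparse 3D convolutions over this representation tractable at the scale of a full LiDAR sweep.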
Self-Supervised and Semi-Supervised Learning
The paper evaluates several self-supervised learning methods, including contrastive learning and clustering-based techniques, emphasizing the impact of data availability:
- Contrastive vs. Clustering: Clustering methods like DeepCluster and SwAV showed advantages over contrastive methods for learning useful representations in this context, likely because it is difficult to obtain representative positive views for contrastive learning from 3D point cloud data in driving scenes.
- Unlabeled Data Potential: The results indicate that, given an appropriate method, increasing the amount of unlabeled data consistently improves 3D detection performance, supporting the case for more robust strategies that exploit unlabeled data.
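For reference, the contrastive objective underlying the comparison above is typically InfoNCE, which pulls two views of the same scene together and pushes other scenes away in embedding space. A minimal NumPy sketch, in which the temperature and embedding sizes are illustrative assumptions:

```python
# Hedged sketch of the InfoNCE contrastive loss; dimensions and the
# temperature value are illustrative, not the paper's settings.
import numpy as np

def info_nce(queries, keys, temperature=0.1):
    """Contrastive loss where row i of `queries` matches row i of `keys`."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = q @ k.T / temperature                 # scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))             # NLL of the positive pairs

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 8))  # embeddings of 4 scenes
# Matching views give a low loss; mismatched (shuffled) views give a high one.
aligned = info_nce(z, z)
shuffled = info_nce(z, z[::-1])
print(aligned < shuffled)  # prints "True"
```

The hard part the paper points to is not this loss but constructing the two views: good augmentations are far less obvious for sparse LiDAR sweeps than for images, which is one plausible reason clustering-based objectives fared better.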
Implications and Future Work
The introduction of the ONCE dataset and its accompanying benchmark offers a new path for 3D object detection research. It also raises questions about the best strategies for model training and improvement in varied realistic scenarios. The results show promising directions for:
- Algorithm Development: There is potential for the development and refinement of algorithms that can efficiently leverage large-scale unlabeled data.
- Generalization Improvements: Training approaches that better exploit diverse data can improve generalization, which is critical as models move from controlled environments to broader real-world scenarios.
- Integration and Fusion: Further studies could explore optimizing integration techniques for data from different sensor modalities to fully harness complementary information.
By addressing the limitations of current datasets in terms of scale and diversity, the ONCE dataset is positioned as a fundamental resource for ongoing and future research in autonomous driving technologies. Future developments might involve expanding annotations and supporting additional driving-related tasks, leading toward comprehensive datasets fueling the next generation of autonomous systems.