Scalability in Perception for Autonomous Driving: Waymo Open Dataset (1912.04838v7)

Published 10 Dec 2019 in cs.CV, cs.LG, and stat.ML

Abstract: The research community has increasing interest in autonomous driving research, despite the resource intensity of obtaining representative real world data. Existing self-driving datasets are limited in the scale and variation of the environments they capture, even though generalization within and between operating regions is crucial to the overall viability of the technology. In an effort to help align the research community's contributions with real-world self-driving problems, we introduce a new large scale, high quality, diverse dataset. Our new dataset consists of 1150 scenes that each span 20 seconds, consisting of well synchronized and calibrated high quality LiDAR and camera data captured across a range of urban and suburban geographies. It is 15x more diverse than the largest camera+LiDAR dataset available based on our proposed diversity metric. We exhaustively annotated this data with 2D (camera image) and 3D (LiDAR) bounding boxes, with consistent identifiers across frames. Finally, we provide strong baselines for 2D as well as 3D detection and tracking tasks. We further study the effects of dataset size and generalization across geographies on 3D detection methods. Find data, code and more up-to-date information at http://www.waymo.com/open.

Authors (25)
  1. Pei Sun (49 papers)
  2. Henrik Kretzschmar (12 papers)
  3. Xerxes Dotiwalla (4 papers)
  4. Aurelien Chouard (2 papers)
  5. Vijaysai Patnaik (1 paper)
  6. Paul Tsui (1 paper)
  7. James Guo (3 papers)
  8. Yin Zhou (32 papers)
  9. Yuning Chai (25 papers)
  10. Benjamin Caine (10 papers)
  11. Vijay Vasudevan (24 papers)
  12. Wei Han (202 papers)
  13. Jiquan Ngiam (17 papers)
  14. Hang Zhao (156 papers)
  15. Aleksei Timofeev (7 papers)
  16. Scott Ettinger (10 papers)
  17. Maxim Krivokon (1 paper)
  18. Amy Gao (2 papers)
  19. Aditya Joshi (43 papers)
  20. Sheng Zhao (75 papers)
Citations (2,432)

Summary

Scalability in Perception for Autonomous Driving: Waymo Open Dataset

The paper "Scalability in Perception for Autonomous Driving: Waymo Open Dataset" addresses the critical challenge of data scalability in autonomous driving research. The authors introduce a large-scale, high-quality, multimodal dataset designed to facilitate advanced research in perception for self-driving vehicles.

Dataset Composition and Features

The Waymo Open Dataset is a comprehensive collection of sensor data captured from autonomous vehicles operating across a range of urban and suburban geographies. It comprises 1,150 scenes, each spanning 20 seconds, with synchronized data from high-resolution cameras and LiDAR sensors. By the authors' proposed geographic diversity metric, it is 15 times more diverse than the largest existing camera+LiDAR dataset.
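
For orientation, the released scenes are distributed as TFRecord files of serialized Frame protocol buffers, one frame per record. A minimal loading sketch, assuming the official waymo_open_dataset Python package is installed and the placeholder path below is replaced with a locally downloaded segment file:

```python
import tensorflow as tf
from waymo_open_dataset import dataset_pb2 as open_dataset

# Placeholder path to one downloaded scene ("segment") file.
FILENAME = "segment-XXXXXXXX_with_camera_labels.tfrecord"

# Each record in the file is one serialized Frame proto; at roughly 10 Hz
# over a 20-second scene this is on the order of 200 frames per file.
dataset = tf.data.TFRecordDataset(FILENAME, compression_type="")

for data in dataset:
    frame = open_dataset.Frame()
    frame.ParseFromString(bytearray(data.numpy()))
    # Each frame bundles synchronized LiDAR returns, camera images,
    # calibration, and vehicle pose for a single timestamp.
    print(frame.timestamp_micros, len(frame.images), len(frame.lasers))
    break  # inspect only the first frame
```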

The dataset includes:

  • 12 million 3D LiDAR box annotations
  • 12 million 2D camera box annotations
  • 113k unique LiDAR object tracks
  • 250k camera image tracks

Annotations are exhaustively reviewed, ensuring high accuracy, and they cover vehicles, pedestrians, signs, and cyclists. The synchronization between LiDAR and camera data is meticulously maintained, offering researchers a robust foundation for developing sensor fusion algorithms.
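
Continuing the loading sketch above, the per-frame 3D annotations are exposed as label protos on each frame. The field names below follow the dataset's published Frame/Label schema but should be verified against the release you download:

```python
# Inspect the 3D LiDAR labels attached to a single frame.
for label in frame.laser_labels:
    box = label.box  # 3D box: center, dimensions, and heading (yaw)
    print(
        label.id,    # track identifier, consistent across frames in a scene
        label.type,  # enum: vehicle, pedestrian, sign, or cyclist
        (box.center_x, box.center_y, box.center_z),
        (box.length, box.width, box.height),
        box.heading,
    )

# 2D camera-image boxes live in frame.camera_labels, grouped per camera,
# with their own track identifiers within each camera stream.
```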

Methodological Contributions

The paper provides strong baselines for 2D and 3D object detection and tracking, tasks that are central to real-world autonomous driving systems. Using established methods such as PointPillars for LiDAR-based 3D detection, the authors report the following baseline results:

  • 3D Vehicle Detection APH: 62.8 (LEVEL_1)
  • 3D Pedestrian Detection APH: 50.2 (LEVEL_1)
  • 3D Vehicle Tracking MOTA: 42.5 (LEVEL_1)
  • 3D Pedestrian Tracking MOTA: 38.9 (LEVEL_1)

These benchmarks provide a reference point for future research and help evaluate the efficacy of novel approaches in a controlled environment.
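
For reference, APH (average precision weighted by heading) follows the usual AP computation but discounts each true positive by its heading error, so a detection with a perfect heading contributes fully and one that is 180 degrees off contributes nothing. A sketch of the per-detection weight (see the paper for the exact definition):

$$w = 1 - \frac{\min\!\left(|\tilde{\theta} - \theta|,\ 2\pi - |\tilde{\theta} - \theta|\right)}{\pi}$$

where $\tilde{\theta}$ is the predicted heading and $\theta$ the ground-truth heading of a matched box.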

Domain Adaptation and Dataset Diversity

A notable aspect of the dataset is its geographical diversity, with data collected from multiple cities, including San Francisco, Phoenix, and Mountain View. This diversity introduces a domain gap, offering opportunities for research in domain adaptation. Preliminary experiments show pronounced performance differences when models trained on one city are tested in another, underlining the dataset's potential to drive advancements in this area.

For instance, training on San Francisco data and evaluating on suburban regions led to an 8.0-point drop in 3D vehicle detection APH, indicating substantial domain shift and the need for robust adaptation techniques.
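
A hedged sketch of how such a cross-geography split could be constructed, assuming each scene's frame context exposes a per-scene location string (the context.stats.location field name and the helper functions here are illustrative assumptions and should be checked against the released schema):

```python
from collections import defaultdict

# Group downloaded scenes by the location tag recorded in the frame context,
# then train on one geography and evaluate on another to probe domain shift.
scenes_by_location = defaultdict(list)

for filename in scene_filenames:                 # hypothetical list of segment files
    first_frame = read_first_frame(filename)     # hypothetical helper, e.g. the loading sketch above
    location = first_frame.context.stats.location  # assumed field, e.g. "location_sf", "location_phx"
    scenes_by_location[location].append(filename)

train_scenes = scenes_by_location["location_sf"]   # urban: San Francisco
test_scenes = scenes_by_location["location_phx"]   # suburban: Phoenix
```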

Implications and Future Directions

The Waymo Open Dataset sets a new benchmark in scale and quality for autonomous driving research. Its extensive annotations and synchronized multimodal sensor data enable the development and testing of advanced perception algorithms. The dataset's diversity also paves the way for research into domain adaptation, a critical challenge in deploying autonomous systems in varied environments.

Looking forward, potential expansions of the dataset could include map information, more diverse geographical and temporal coverage, and condition-specific scenarios such as varied weather. These additions would support research not only in perception but also in behavior prediction, planning, and more sophisticated domain adaptation strategies.

Conclusion

The Waymo Open Dataset is a significant contribution to the autonomous driving research community, providing a rich and diverse resource for developing and benchmarking perception algorithms. It addresses the scalability challenge by offering extensive, high-quality data and opens up new avenues for research in domain adaptation and sensor fusion. The dataset's impact is expected to accelerate progress toward robust and generalizable autonomous driving systems.

The dataset and associated code are publicly available, and the authors plan to maintain a public leaderboard to track advancements in the field. This commitment to open science is conducive to collaborative progress and continuous improvement in autonomous vehicle technology.