Argoverse: 3D Tracking and Forecasting with Rich Maps (1911.02620v1)

Published 6 Nov 2019 in cs.CV and cs.RO

Abstract: We present Argoverse -- two datasets designed to support autonomous vehicle machine learning tasks such as 3D tracking and motion forecasting. Argoverse was collected by a fleet of autonomous vehicles in Pittsburgh and Miami. The Argoverse 3D Tracking dataset includes 360 degree images from 7 cameras with overlapping fields of view, 3D point clouds from long range LiDAR, 6-DOF pose, and 3D track annotations. Notably, it is the only modern AV dataset that provides forward-facing stereo imagery. The Argoverse Motion Forecasting dataset includes more than 300,000 5-second tracked scenarios with a particular vehicle identified for trajectory forecasting. Argoverse is the first autonomous vehicle dataset to include "HD maps" with 290 km of mapped lanes with geometric and semantic metadata. All data is released under a Creative Commons license at www.argoverse.org. In our baseline experiments, we illustrate how detailed map information such as lane direction, driveable area, and ground height improves the accuracy of 3D object tracking and motion forecasting. Our tracking and forecasting experiments represent only an initial exploration of the use of rich maps in robotic perception. We hope that Argoverse will enable the research community to explore these problems in greater depth.

Citations (1,174)

Summary

  • The paper’s main contribution is the introduction of two novel datasets that leverage rich HD maps to enhance 3D tracking and trajectory forecasting.
  • It details the integration of multi-camera imagery, LiDAR, and semantic map data, with baseline trackers achieving improved MOTA and MOTP.
  • Experiments demonstrate that incorporating map-based orientation and lane information significantly improves forecasting accuracy, yielding lower minADE and minFDE.

An Expert Review of "Argoverse: 3D Tracking and Forecasting with Rich Maps"

The paper "Argoverse: 3D Tracking and Forecasting with Rich Maps" presents a significant contribution to the domain of autonomous vehicle (AV) research. Authored by researchers from Argo AI, Carnegie Mellon University, and the Georgia Institute of Technology, this work introduces two datasets aimed at enhancing machine learning tasks in autonomous driving, specifically focusing on 3D tracking and motion forecasting.

Overview of Argoverse Datasets

The Argoverse initiative offers two distinct datasets: the Argoverse 3D Tracking dataset and the Argoverse Motion Forecasting dataset. These datasets were collected by a fleet of AVs operating in Pittsburgh and Miami, offering a diverse set of real-world conditions.

Argoverse 3D Tracking Dataset

This dataset includes 360-degree images from seven cameras with overlapping fields of view, long-range LiDAR point clouds, 6-DOF (six degrees of freedom) pose information, and annotated 3D object tracks. Notably, it is the only modern AV dataset to provide forward-facing stereo imagery. The dataset is accompanied by detailed "HD maps" that include lane centerlines, driveable regions, and ground height data.

Argoverse Motion Forecasting Dataset

Complementing the tracking data, this dataset contains more than 300,000 five-second tracked scenarios, each with a particular vehicle identified for trajectory forecasting. The same rich map information is included, featuring 290 km of mapped lanes with geometric and semantic metadata, making it a valuable resource for forecasting research.
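
To make the scenario format concrete, here is a minimal sketch of splitting one scenario into the focal vehicle's observed history and the future segment to be forecast. The CSV column names, the "AGENT" label for the focal vehicle, and the 10 Hz / 2-second observation window are assumptions made for illustration, not details confirmed by the text above.

```python
import pandas as pd

def split_focal_trajectory(csv_path: str, num_obs_frames: int = 20):
    """Split the focal ("AGENT") vehicle's track into observed and future parts.

    Assumes the scenario CSV has TIMESTAMP, TRACK_ID, OBJECT_TYPE, X, Y columns
    and that the focal vehicle is labelled OBJECT_TYPE == "AGENT".
    num_obs_frames=20 corresponds to a 2-second observation window at an
    assumed 10 Hz sampling rate -- all assumptions about the release format.
    """
    df = pd.read_csv(csv_path)
    agent = df[df["OBJECT_TYPE"] == "AGENT"].sort_values("TIMESTAMP")
    xy = agent[["X", "Y"]].to_numpy()
    return xy[:num_obs_frames], xy[num_obs_frames:]
```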

Methodology and Experiments

The paper's experiments aim to demonstrate the value of detailed map information in improving 3D object tracking and motion forecasting accuracy. The following key aspects were evaluated:

  1. Effect of Map Information on 3D Tracking:
    • Map-based ground removal and orientation alignment with lane direction were found to improve tracking performance (a minimal sketch of the ground-removal step appears after this list).
    • The utilization of semantic map data yielded higher accuracy in object tracking.
  2. Motion Forecasting with Rich Maps:
    • By leveraging map data, the researchers demonstrated improved accuracy in trajectory forecasting.
    • Diverse predictions generated from lane graphs, together with map-based pruning of candidate trajectories, enhanced the robustness of the forecasting models.
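
As a rough illustration of the ground-removal idea, the sketch below drops LiDAR returns that lie close to the map's ground surface before tracking. The `ground_height_at` lookup is a hypothetical stand-in for an HD-map query, and the 30 cm threshold is an illustrative choice, not a value taken from the paper.

```python
import numpy as np

def remove_ground_points(points: np.ndarray, ground_height_at, threshold: float = 0.3):
    """Drop LiDAR points near the mapped ground surface.

    points            -- (N, 3) array of x, y, z in the map frame
    ground_height_at  -- hypothetical callable mapping (x, y) arrays to
                         ground elevation z (stands in for an HD-map query)
    threshold         -- keep points more than this many metres above ground
    """
    ground_z = ground_height_at(points[:, 0], points[:, 1])
    above_ground = (points[:, 2] - ground_z) > threshold
    return points[above_ground]
```

On sloped roads, querying a mapped ground surface avoids the errors of fitting a single ground plane per sweep, which is presumably part of the benefit the map-based baselines measure.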

Evaluation Metrics

The evaluation of 3D tracking utilized standard metrics like MOTA (Multi-Object Tracking Accuracy) and MOTP (Multi-Object Tracking Precision), while motion forecasting leveraged metrics such as minimum Average Displacement Error (minADE) and minimum Final Displacement Error (minFDE).
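
To make the forecasting metrics concrete, here is a minimal sketch of how minADE and minFDE are typically computed for K candidate trajectories against the ground-truth future (standard definitions; the exact evaluation code used in the paper may differ).

```python
import numpy as np

def min_ade_fde(predictions: np.ndarray, ground_truth: np.ndarray):
    """Compute minADE and minFDE over K candidate trajectories.

    predictions  -- (K, T, 2) array of K forecasts, T timesteps of (x, y)
    ground_truth -- (T, 2) array with the observed future trajectory
    """
    # Per-timestep Euclidean error for each candidate: shape (K, T)
    errors = np.linalg.norm(predictions - ground_truth[None, :, :], axis=-1)

    ade_per_candidate = errors.mean(axis=1)   # average displacement error per candidate
    fde_per_candidate = errors[:, -1]         # displacement at the final timestep

    return ade_per_candidate.min(), fde_per_candidate.min()
```

For example, calling it with a (6, 30, 2) array of six candidates and a (30, 2) ground-truth trajectory returns the average and final-step errors of the best candidate.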

Key Numerical Results

  • The 3D Tracking dataset provides track annotations for 15 object classes, and the Motion Forecasting dataset contains 300,000+ sequences.
  • The 3D Tracking dataset contains significantly more tracked objects than KITTI.
  • The HD maps' driveable-area annotations cover more than 1,192,073 m².

Implications and Future Work

The paper's key contribution is the rich map data, which facilitates higher accuracy in both 3D tracking and motion forecasting tasks. This advancement is pivotal as the field moves toward more complex and realistic autonomous driving scenarios. The paper sets a strong precedent for future work to incorporate detailed map information. Potential future directions include leveraging larger datasets, improving map automation techniques, and integrating multi-modal sensory data more seamlessly.

Conclusion

"Argoverse: 3D Tracking and Forecasting with Rich Maps" provides a fundamental step forward in the AV research community by releasing comprehensive datasets enriched with HD maps. The detailed mapping information and the large-scale datasets support the development of more accurate and robust AV systems. These datasets and their accompanying baseline experiments lay the groundwork for future research to explore the utilization of map data in enhancing AV perception and navigation capabilities.