
Large Scale Interactive Motion Forecasting for Autonomous Driving : The Waymo Open Motion Dataset (2104.10133v1)

Published 20 Apr 2021 in cs.CV, cs.LG, and cs.RO

Abstract: As autonomous driving systems mature, motion forecasting has received increasing attention as a critical requirement for planning. Of particular importance are interactive situations such as merges, unprotected turns, etc., where predicting individual object motion is not sufficient. Joint predictions of multiple objects are required for effective route planning. There has been a critical need for high-quality motion data that is rich in both interactions and annotation to develop motion planning models. In this work, we introduce the most diverse interactive motion dataset to our knowledge, and provide specific labels for interacting objects suitable for developing joint prediction models. With over 100,000 scenes, each 20 seconds long at 10 Hz, our new dataset contains more than 570 hours of unique data over 1750 km of roadways. It was collected by mining for interesting interactions between vehicles, pedestrians, and cyclists across six cities within the United States. We use a high-accuracy 3D auto-labeling system to generate high quality 3D bounding boxes for each road agent, and provide corresponding high definition 3D maps for each scene. Furthermore, we introduce a new set of metrics that provides a comprehensive evaluation of both single agent and joint agent interaction motion forecasting models. Finally, we provide strong baseline models for individual-agent prediction and joint-prediction. We hope that this new large-scale interactive motion dataset will provide new opportunities for advancing motion forecasting models.

Citations (469)

Summary

  • The paper introduces a large-scale dataset with over 100,000 curated scenes, emphasizing interactive scenarios critical for autonomous driving.
  • It uses a high-accuracy 3D auto-labeling system to generate 3D bounding boxes for each road agent, and provides a corresponding HD 3D map for each scene, with data drawn from six U.S. cities.
  • LSTM-based baseline models become more accurate when given additional context such as road graph information and traffic signal states.

Overview of Large Scale Interactive Motion Forecasting for Autonomous Driving

This paper addresses motion forecasting for autonomous driving and centers on a new large-scale interactive motion dataset. Because planning depends on accurate forecasts, especially in interactive situations such as merges and unprotected turns, the authors introduce a dataset of over 100,000 scenes, each 20 seconds long at 10 Hz, totaling more than 570 hours of unique driving data. The dataset is notable for its diversity, its density of interactions, and its annotations, including specific labels for interacting objects, all of which matter for developing joint prediction models.
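As a quick sanity check on how the headline figures relate, the arithmetic can be worked through in a few lines. This is only a sketch: the constants are the lower bounds quoted in the abstract ("over 100,000" scenes, "more than 570" hours), the variable names are illustrative, and the exact scene count is not stated in this summary.

```python
import math

# Figures quoted in the abstract, treated here as lower bounds.
SCENE_SECONDS = 20         # each scene is 20 seconds long
SAMPLE_RATE_HZ = 10        # sampled at 10 Hz
REPORTED_SCENES = 100_000  # "over 100,000 scenes"
REPORTED_HOURS = 570       # "more than 570 hours"

# Sampling intervals per scene (endpoint conventions may add one sample).
samples_per_scene = SCENE_SECONDS * SAMPLE_RATE_HZ

# Total duration if there were exactly 100,000 scenes.
hours_at_lower_bound = REPORTED_SCENES * SCENE_SECONDS / 3600

# Minimum number of 20-second scenes needed to reach 570 hours.
min_scenes_for_hours = math.ceil(REPORTED_HOURS * 3600 / SCENE_SECONDS)

print(samples_per_scene)               # 200
print(round(hours_at_lower_bound, 1))  # 555.6
print(min_scenes_for_hours)            # 102600
```

Exactly 100,000 scenes would give about 555.6 hours, so the "more than 570 hours" figure implies a scene count of at least 102,600, which is consistent with "over 100,000".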

Dataset Characteristics

The dataset offers high-quality, auto-labeled 3D data that captures the complexity of real-world driving. It was collected across six cities within the United States and covers more than 1750 km of roadways, providing varied road geometries and diverse interactions among vehicles, pedestrians, and cyclists. The authors use a high-accuracy 3D auto-labeling system to generate 3D bounding boxes for each road agent, avoiding the substantial cost of manual labeling, and provide a corresponding high-definition 3D map for each scene.

Metrics and Evaluation

The authors propose a set of metrics for evaluating both single-agent and joint (multi-agent) motion forecasting models. These include minADE (minimum average displacement error), minFDE (minimum final displacement error), and a mean Average Precision (mAP) variant adapted to motion forecasting that accounts for different trajectory shapes and prediction time scales. Together they assess both how close the best predicted trajectory is to the ground truth and how well a model handles interactions between agents.
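The displacement metrics named above can be illustrated with a minimal sketch. It uses the standard definitions of minADE and minFDE over K candidate trajectories. The joint variant assumes the error of a joint future is averaged over all agents and time steps before taking the minimum, which may differ in detail from the paper's exact formulation. All function names are hypothetical, and mAP is not sketched because this summary does not describe how it is computed.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Trajectory = Sequence[Point]


def displacement_errors(pred: Trajectory, gt: Trajectory) -> List[float]:
    """Per-time-step Euclidean distance between a prediction and ground truth."""
    if len(pred) != len(gt):
        raise ValueError("prediction and ground truth must have equal length")
    return [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]


def min_ade(candidates: Sequence[Trajectory], gt: Trajectory) -> float:
    """Smallest time-averaged displacement error over the K candidates."""
    return min(sum(e) / len(e) for e in (displacement_errors(c, gt) for c in candidates))


def min_fde(candidates: Sequence[Trajectory], gt: Trajectory) -> float:
    """Smallest final-step displacement error over the K candidates."""
    return min(displacement_errors(c, gt)[-1] for c in candidates)


def joint_min_ade(joint_candidates: Sequence[Sequence[Trajectory]],
                  gts: Sequence[Trajectory]) -> float:
    """Assumed joint variant: joint_candidates[k][a] is agent a's trajectory in
    joint future k; errors are pooled over agents and time before the min."""
    best = math.inf
    for joint in joint_candidates:
        errors = [e for pred, gt in zip(joint, gts)
                  for e in displacement_errors(pred, gt)]
        best = min(best, sum(errors) / len(errors))
    return best


# Single agent: the best candidate differs between the two metrics.
gt_a = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
candidates = [
    [(1.0, 1.0), (2.0, 1.0), (3.0, 1.0)],  # constant 1 m offset
    [(1.0, 0.0), (2.0, 0.0), (3.0, 2.0)],  # exact until a 2 m final error
]
print(round(min_ade(candidates, gt_a), 3))  # 0.667 (second candidate)
print(min_fde(candidates, gt_a))            # 1.0 (first candidate)

# Two agents: a joint future is scored as a whole, not per agent.
gt_b = [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)]
joint_futures = [
    [gt_a, [(2.0, 1.0), (2.0, 2.0), (2.0, 3.0)]],       # agent a exact, agent b 2 m off
    [[(1.0, 0.5), (2.0, 0.5), (3.0, 0.5)],
     [(0.5, 1.0), (0.5, 2.0), (0.5, 3.0)]],             # both agents 0.5 m off
]
print(joint_min_ade(joint_futures, [gt_a, gt_b]))  # 0.5
```

In the two-agent example, picking the best trajectory for each agent independently would mix trajectories from different joint futures. Scoring each joint future as a unit is what makes the metric sensitive to whether the predicted agent behaviors are consistent with one another.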

Baseline Models and Results

Several baseline models are introduced, covering both individual-agent and joint prediction. They are built on LSTM architectures and extended with contextual inputs such as road graph information and traffic signal states. The reported results show that prediction accuracy improves when these contextual features are added.

Implications and Future Directions

The release of this dataset represents a substantial contribution to the field of autonomous driving, offering a wealth of data for developing and testing advanced motion forecasting models. The dataset's richness in interactive scenarios is particularly valuable, given the complexity of modeling such environments. Future research can leverage this resource to enhance interaction modeling, potentially improving safety and efficiency in autonomous driving systems.

The authors' treatment of joint prediction is a step toward better understanding of multi-agent dynamics, which becomes more important as real-world deployments of autonomous systems increase. Continued work in this area could lead to algorithms that handle nuanced interactions more reliably, advancing the state of the art in autonomous driving.