
Improving Traffic Signal Data Quality for the Waymo Open Motion Dataset (2506.07150v1)

Published 8 Jun 2025 in cs.RO

Abstract: Datasets pertaining to autonomous vehicles (AVs) hold significant promise for a range of research fields, including AI, autonomous driving, and transportation engineering. Nonetheless, these datasets often encounter challenges related to the states of traffic signals, such as missing or inaccurate data. Such issues can compromise the reliability of the datasets and adversely affect the performance of models developed using them. This research introduces a fully automated approach designed to tackle these issues by utilizing available vehicle trajectory data alongside knowledge from the transportation domain to effectively impute and rectify traffic signal information within the Waymo Open Motion Dataset (WOMD). The proposed method is robust and flexible, capable of handling diverse intersection geometries and traffic signal configurations in real-world scenarios. Comprehensive validations have been conducted on the entire WOMD, focusing on over 360,000 relevant scenarios involving traffic signals, out of a total of 530,000 real-world driving scenarios. In the original dataset, 71.7% of traffic signal states are either missing or unknown, all of which were successfully imputed by our proposed method. Furthermore, in the absence of ground-truth signal states, the accuracy of our approach is evaluated based on the rate of red-light violations among vehicle trajectories. Results show that our method reduces the estimated red-light running rate from 15.7% in the original data to 2.9%, thereby demonstrating its efficacy in rectifying data inaccuracies. This paper significantly enhances the quality of AV datasets, contributing to the wider AI and AV research communities and benefiting various downstream applications. The code and improved traffic signal data are open-sourced at https://github.com/michigan-traffic-lab/WOMD-Traffic-Signal-Data-Improvement

Summary

  • The paper presents an automated method that infers accurate traffic signal states using vehicle trajectory data and domain-specific configurations.
  • In the original dataset, 71.7% of traffic signal states are missing or unknown; the method imputes all of them and reduces the estimated red-light running rate from 15.7% to 2.9%.
  • Cleaner signal data supports downstream tasks such as trajectory prediction and AV decision-making.

Improving Traffic Signal Data for Autonomous Vehicle Datasets

The paper "Improving Traffic Signal Data Quality for the Waymo Open Motion Dataset" addresses problems with traffic signal information in autonomous vehicle datasets, specifically the Waymo Open Motion Dataset (WOMD). It identifies three common issues (unknown, missing, and inaccurate traffic signal states) and presents a fully automated method that imputes and corrects them using vehicle trajectory data and transportation domain knowledge. The method adapts to diverse intersection geometries and traffic signal configurations, which improves data quality for downstream applications.

Methodology Overview

The approach combines vehicle trajectory data with transportation domain knowledge, in particular the ring-and-barrier diagram structure that is prevalent in traffic signal design, to infer traffic signal states. It works in two steps: first, it estimates a preliminary traffic signal state from vehicle trajectory data; second, it refines these estimates by restricting them to feasible configurations based on real-world traffic signal timing practices. The iterative method ensures that the final traffic signal states are consistent with both the observed vehicle trajectories and the known structural constraints of traffic signal configurations.
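To make the ring-and-barrier constraint concrete, here is a minimal sketch of a feasibility check that a refinement step could apply to a preliminary estimate. It assumes the standard eight-phase dual-ring layout (phases 1 to 4 in ring 1, phases 5 to 8 in ring 2, with a barrier separating phases {1, 2, 5, 6} from {3, 4, 7, 8}). The names `RING`, `BARRIER_GROUP`, `compatible`, and `feasible_green_set` are hypothetical and are not taken from the paper or its repository, which may handle other configurations.

```python
from itertools import combinations

# Assumed standard dual-ring, eight-phase layout (illustrative only).
RING = {1: 1, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 2, 8: 2}
# Side of the barrier each phase belongs to: 0 = {1, 2, 5, 6}, 1 = {3, 4, 7, 8}.
BARRIER_GROUP = {1: 0, 2: 0, 5: 0, 6: 0, 3: 1, 4: 1, 7: 1, 8: 1}


def compatible(p, q):
    """Two distinct phases may be green together only if they are in
    different rings and on the same side of the barrier."""
    return RING[p] != RING[q] and BARRIER_GROUP[p] == BARRIER_GROUP[q]


def feasible_green_set(phases):
    """Return True if every pair of simultaneously green phases is compatible."""
    return all(compatible(p, q) for p, q in combinations(set(phases), 2))


# A preliminary estimate that shows phases 2 and 4 green at once cannot be
# right under this layout, so a refinement step would have to change one.
print(feasible_green_set({2, 6}))  # phases in different rings, same barrier side
print(feasible_green_set({2, 4}))  # both phases in ring 1
```

Under this layout at most one phase per ring can be green at a time, so any set of three or more phases is automatically rejected by the pairwise check.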

Numerical Results and Analysis

The method was validated on the entire WOMD, covering over 360,000 scenarios that involve traffic signals out of a total of 530,000 real-world driving scenarios. In the original dataset, 71.7% of traffic signal states are either missing or unknown, and the method imputed all of them. Because ground-truth signal states are unavailable, accuracy is assessed through the rate of red-light violations among vehicle trajectories: the estimated red-light running rate falls from 15.7% in the original data to 2.9% after correction, which indicates that the method rectifies inaccurate signal states.
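The red-light running rate used as a proxy metric can be illustrated with a short sketch. It assumes each stop-line crossing is recorded as a `(lane_id, crossing_time)` pair and each lane has a signal timeline of `(start_time, state)` entries sorted by time. These data structures and the function names are hypothetical, and the choice to skip crossings with no known state is an assumption of this sketch rather than the paper's documented procedure. Permitted movements such as right turns on red are not modeled.

```python
from bisect import bisect_right


def state_at(timeline, t):
    """Look up the signal state in effect at time t.

    timeline: list of (start_time, state) tuples sorted by start_time.
    Returns "unknown" if t precedes the first entry or the timeline is empty.
    """
    starts = [start for start, _ in timeline]
    i = bisect_right(starts, t) - 1
    return timeline[i][1] if i >= 0 else "unknown"


def red_light_running_rate(crossings, timelines):
    """Fraction of judged stop-line crossings that happen while the signal is red.

    crossings: list of (lane_id, crossing_time) tuples.
    timelines: dict mapping lane_id to its signal timeline.
    Crossings whose state is unknown are skipped (an assumption of this sketch).
    """
    judged = 0
    violations = 0
    for lane_id, t in crossings:
        state = state_at(timelines.get(lane_id, []), t)
        if state == "unknown":
            continue
        judged += 1
        if state == "red":
            violations += 1
    return violations / judged if judged else 0.0


timelines = {
    "a": [(0.0, "red"), (3.0, "green")],
    "b": [(0.0, "green"), (5.0, "yellow"), (8.0, "red")],
}
crossings = [("a", 1.0), ("a", 4.0), ("b", 2.0), ("b", 8.5), ("c", 1.0)]
print(red_light_running_rate(crossings, timelines))  # 2 of 4 judged crossings are on red
```

A lower rate after correction suggests that the corrected states agree better with how vehicles actually moved, though the metric cannot separate genuine violations from remaining labeling errors.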

Implications and Future Directions

By significantly improving the quality of traffic signal information, this research aids the development of more reliable AI models for autonomous driving applications. It bridges a critical gap in AV datasets, which are foundational for trajectory prediction, AV decision-making, and human behavior modeling. The improvements facilitate better simulation and testing environments for autonomous vehicles, leading to safer and more reliable systems.

However, the methodology is primarily validated on standard intersection configurations, and its performance in irregular intersections with atypical geometries may require further refinement. Future research could focus on enhancing the robustness of the approach for nonstandard intersection types and exploring the integration of additional data sources to improve signal state estimation further.

In conclusion, this paper contributes significantly to dataset quality enhancement, presenting a practical tool that researchers and developers can utilize to refine traffic signal information in large-scale autonomous vehicle datasets. This effort not only supports advancements in AI applications but also underscores the collaborative potential between transportation engineering and artificial intelligence research.
