
TartanAviation: Image, Speech, and ADS-B Trajectory Datasets for Terminal Airspace Operations (2403.03372v1)

Published 5 Mar 2024 in cs.LG

Abstract: We introduce TartanAviation, an open-source multi-modal dataset focused on terminal-area airspace operations. TartanAviation provides a holistic view of the airport environment by concurrently collecting image, speech, and ADS-B trajectory data using setups installed inside airport boundaries. The datasets were collected at both towered and non-towered airfields across multiple months to capture diversity in aircraft operations, seasons, aircraft types, and weather conditions. In total, TartanAviation provides 3.1M images, 3374 hours of Air Traffic Control speech data, and 661 days of ADS-B trajectory data. The data was filtered, processed, and validated to create a curated dataset. In addition to the dataset, we also open-source the code-base used to collect and pre-process the dataset, further enhancing accessibility and usability. We believe this dataset has many potential use cases and would be particularly vital in allowing AI and machine learning technologies to be integrated into air traffic control systems and advance the adoption of autonomous aircraft in the airspace.


Summary

  • The paper introduces a novel open-source multimodal dataset integrating image, speech, and ADS-B trajectory data for enhanced terminal airspace operations.
  • It details rigorous data collection and preprocessing methodologies across diverse airport environments and weather conditions.
  • The dataset enables advancements in computer vision, time-series analysis, and speech-to-text systems for improved air traffic control communications.

TartanAviation: Comprehensive Multimodal Dataset for Enhancing Terminal Airspace Operations through AI

Introduction

The rising demand for air travel and the imminent integration of Advanced Aerial Mobility (AAM) into the National Airspace System underscore the need for advances in air traffic control systems. TartanAviation addresses this need as an open-source multimodal dataset aimed at fostering innovation in terminal airspace operations. It captures the airport environment by concurrently collecting image, speech, and ADS-B trajectory data from setups installed inside airport boundaries. This combination supports the development of AI-driven technologies for air traffic management and aligns with the broader objective of integrating autonomous aircraft into the airspace.

Dataset Overview

Collected at towered and non-towered airfields within the US over multiple months, TartanAviation covers a diverse range of aircraft operations, seasons, aircraft types, and weather conditions. The dataset comprises 3.1M images, 3374 hours of air traffic control speech data, and 661 days of ADS-B trajectory data. The data was filtered, processed, and validated to create a curated dataset. The authors also open-source the codebase used to collect and preprocess the data, which makes the dataset easier to access and use.

Multimodality at its Core

Vision Data

TartanAviation’s vision data, collected using an array of Sony IMX 264 cameras, covers a wide range of scenarios, including adverse weather conditions, and provides over 700k aircraft labels. A real-world dataset at this scale supports the development of robust computer vision techniques for long-range object detection, which is crucial to aviation safety through visual detect-and-avoid (DAA) systems.

Trajectory Data

The trajectory component of TartanAviation is an extensive collection of time-series data describing aircraft movements within terminal airspace. It extends prior work by offering 661 days of data from both towered and non-towered airports, enabling research not only in aviation but also in broader areas such as time-series forecasting and anomaly detection.
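As an illustration of the forecasting use case mentioned above, the following is a minimal sketch of a constant-velocity baseline, a common reference point for trajectory prediction. It is not taken from the TartanAviation codebase. The sample layout (time in seconds, then x, y, z in meters in a local frame) and the function names are assumptions made for this example, and the actual trajectory files may be organized differently.

```python
import math
from typing import List, Tuple

# Hypothetical sample layout: (time in s, x in m, y in m, z in m) in a local frame.
Sample = Tuple[float, float, float, float]


def constant_velocity_forecast(track: List[Sample], horizon_s: float,
                               step_s: float = 1.0) -> List[Sample]:
    """Extrapolate the velocity between the last two samples over horizon_s."""
    if len(track) < 2:
        raise ValueError("need at least two samples to estimate velocity")
    (t0, x0, y0, z0), (t1, x1, y1, z1) = track[-2], track[-1]
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("timestamps must be strictly increasing")
    vx, vy, vz = (x1 - x0) / dt, (y1 - y0) / dt, (z1 - z0) / dt
    n_steps = int(horizon_s // step_s)
    forecast = []
    for k in range(1, n_steps + 1):
        d = k * step_s  # time elapsed since the last observed sample
        forecast.append((t1 + d, x1 + vx * d, y1 + vy * d, z1 + vz * d))
    return forecast


def average_displacement_error(pred: List[Sample], truth: List[Sample]) -> float:
    """Mean 3D distance between time-aligned predicted and observed positions."""
    if not pred or len(pred) != len(truth):
        raise ValueError("pred and truth must be non-empty and equally long")
    return sum(math.dist(p[1:], q[1:]) for p, q in zip(pred, truth)) / len(pred)


if __name__ == "__main__":
    # Made-up two-sample track: 50 m/s along x, descending 2 m/s.
    track = [(0.0, 0.0, 0.0, 100.0), (1.0, 50.0, 0.0, 98.0)]
    print(constant_velocity_forecast(track, horizon_s=3.0))
```

A learned forecasting model would be expected to achieve a lower average displacement error than this baseline on held-out tracks, which is why such a baseline is useful as a sanity check.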

Speech Data

TartanAviation also includes air traffic control speech data from smaller airports, covering both towered and non-towered fields. This speech data, described as the first of its kind and complemented by concurrent trajectory information, opens avenues for multi-modal speech-to-text and intent prediction research tailored to air traffic control communications.
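Because the speech and trajectory data are collected concurrently, one plausible first step for multi-modal work is to pair each utterance with the trajectory samples recorded around the same time. The sketch below shows one way to do that with a binary search over sorted timestamps. The record types, field names, and the padding window are hypothetical and chosen only for illustration; they do not describe the dataset's actual schema or the authors' tooling.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:  # hypothetical record for one radio transmission
    start_s: float
    end_s: float
    audio_path: str


@dataclass
class TrackSample:  # hypothetical record for one ADS-B position report
    t_s: float
    aircraft_id: str
    x_m: float
    y_m: float


def samples_near_utterance(utt: Utterance, samples: List[TrackSample],
                           pad_s: float = 5.0) -> List[TrackSample]:
    """Return samples with start_s - pad_s <= t_s <= end_s + pad_s.

    Assumes `samples` is already sorted by t_s.
    """
    times = [s.t_s for s in samples]
    lo = bisect_left(times, utt.start_s - pad_s)
    hi = bisect_right(times, utt.end_s + pad_s)
    return samples[lo:hi]


if __name__ == "__main__":
    # Made-up samples for a single aircraft, one report every 10 s.
    samples = [TrackSample(float(t), "N123AB", 10.0 * t, 0.0) for t in (0, 10, 20, 30)]
    utt = Utterance(start_s=12.0, end_s=14.0, audio_path="clip_0001.wav")
    print(samples_near_utterance(utt, samples))
```

The padding window is a design choice: a transmission often refers to a maneuver that begins shortly before or after the call, so a small margin on either side keeps that context attached to the utterance.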

Implications and Future Directions

TartanAviation reflects the growing intersection of AI and aviation, particularly in air traffic management. By combining image, speech, and trajectory data collected in the complex environment of terminal airspace, it lays the groundwork for AI solutions that improve the efficiency and safety of traffic management. The dataset supports applications ranging from vision-based object detection and trajectory prediction to speech understanding and intent prediction in air traffic control communications. It also encourages the exploration of multi-modal data utilization in aviation, with potential advances in both the theoretical and practical aspects of AI in aviation.

Accessibility and Utilization

TartanAviation’s structured dataset is accompanied by dataloaders and preprocessing utilities that ease integration with existing technology stacks. Providing the data in common formats encourages adoption within the research community and facilitates rapid experimentation and iteration. This accessibility, combined with the dataset’s breadth, positions it to contribute to AI-driven improvements in terminal airspace operations.

Concluding Remarks

TartanAviation represents a significant step towards realizing the potential of AI in terminal airspace operations. By providing a rich, multimodal dataset, it equips the research community with the tools needed to drive innovations that can shape the future of air travel and air traffic management. As the dataset continues to evolve, it will likely become a valuable resource for researchers and practitioners alike.
