TartanAviation: Image, Speech, and ADS-B Trajectory Datasets for Terminal Airspace Operations (2403.03372v1)
Abstract: We introduce TartanAviation, an open-source multi-modal dataset focused on terminal-area airspace operations. TartanAviation provides a holistic view of the airport environment by concurrently collecting image, speech, and ADS-B trajectory data using setups installed inside airport boundaries. The data was collected at both towered and non-towered airfields over multiple months to capture diversity in aircraft operations, seasons, aircraft types, and weather conditions. In total, TartanAviation provides 3.1M images, 3374 hours of Air Traffic Control speech data, and 661 days of ADS-B trajectory data. The data was filtered, processed, and validated to create a curated dataset. In addition to the dataset, we open-source the codebase used to collect and pre-process the data, further enhancing accessibility and usability. We believe this dataset has many potential use cases and would be particularly vital in allowing AI and machine learning technologies to be integrated into air traffic control systems and in advancing the adoption of autonomous aircraft in the airspace.