Hierarchical Multi-label Classification for Fine-level Event Extraction from Aviation Accident Reports (2403.17914v1)
Abstract: A large volume of accident reports is recorded in the aviation domain, and these reports are of great value for improving aviation safety. To make better use of them, we need to identify the most important events or impact factors they describe. However, the growing number of reports demands considerable effort from the domain experts who label them. To make the labeling process more efficient, many researchers have started developing algorithms that automatically identify the underlying events from accident reports. This article argues that the events can be identified more accurately by leveraging the event taxonomy. More specifically, we treat the problem as a hierarchical classification task in which we first identify the coarse-level information and then predict the fine-level information. We implement this hierarchical classification process by incorporating a novel hierarchical attention module into BERT. To further exploit the information in the event taxonomy, we regularize the proposed model according to the relationships and distribution among labels. The effectiveness of our framework is evaluated on data collected by the National Transportation Safety Board (NTSB). The results show that fine-level prediction accuracy improves substantially and that the regularization term benefits rare event identification.
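The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. It does not reproduce the paper's BERT-based hierarchical attention module or its exact regularizer. The taxonomy, label names, probabilities, threshold, and the hinge-style penalty below are all illustrative assumptions. The sketch only shows how a taxonomy can restrict fine-level predictions to the children of predicted coarse-level events, and how a parent-child consistency term could be computed.

```python
# Hypothetical two-level taxonomy: coarse event -> fine events.
# All label names are illustrative, not taken from the NTSB coding manual.
TAXONOMY = {
    "loss_of_control": ["stall", "spin"],
    "powerplant": ["fuel_starvation", "engine_failure"],
}


def predict_hierarchical(coarse_probs, fine_probs, threshold=0.5):
    """Select coarse labels first, then fine labels only under selected parents."""
    coarse = [c for c in TAXONOMY if coarse_probs[c] >= threshold]
    fine = [
        f
        for c in coarse
        for f in TAXONOMY[c]
        if fine_probs[f] >= threshold
    ]
    return coarse, fine


def hierarchy_penalty(coarse_probs, fine_probs):
    """Assumed consistency term: sum of max(0, p_child - p_parent).

    It is positive whenever a fine label is scored as more likely than
    its coarse parent, which contradicts the taxonomy.
    """
    return sum(
        max(0.0, fine_probs[f] - coarse_probs[c])
        for c, children in TAXONOMY.items()
        for f in children
    )


# Made-up model outputs for a single report.
coarse_probs = {"loss_of_control": 0.9, "powerplant": 0.2}
fine_probs = {
    "stall": 0.8,
    "spin": 0.1,
    "fuel_starvation": 0.7,
    "engine_failure": 0.1,
}

coarse, fine = predict_hierarchical(coarse_probs, fine_probs)
penalty = hierarchy_penalty(coarse_probs, fine_probs)
print(coarse, fine, penalty)
```

In this toy input, `fuel_starvation` scores above the threshold but is dropped because its parent `powerplant` was not predicted, and the same pair is the only one that contributes to the penalty. In a trained model, a term of this kind would be added to the classification loss rather than computed after the fact.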