CMNEE: A Large-Scale Document-Level Event Extraction Dataset based on Open-Source Chinese Military News (2404.12242v1)
Abstract: Extracting structured event knowledge, including event triggers and their corresponding arguments, from military texts is fundamental to many applications, such as intelligence analysis and decision assistance. However, event extraction in the military field suffers from data scarcity, which impedes research on event extraction models in this domain. To alleviate this problem, we propose CMNEE, a large-scale, document-level, open-source Chinese Military News Event Extraction dataset. It contains 17,000 documents and 29,223 events, all manually annotated according to a pre-defined schema for the military domain that includes 8 event types and 11 argument role types. We designed a two-stage, multi-turn annotation strategy to ensure the quality of CMNEE, and we reproduced and systematically evaluated several state-of-the-art event extraction models. The experimental results on CMNEE are clearly lower than those on datasets from other domains, which demonstrates that event extraction for the military domain poses unique challenges and requires further research efforts. Our code and data can be obtained from https://github.com/Mzzzhu/CMNEE.
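To make "event triggers and corresponding arguments" concrete, the following is a minimal sketch of how one document-level event record could be represented. Everything in it is an assumption for illustration: the class names, the field names, the event type "Exercise", the role names, and the English example sentence are all hypothetical and are not taken from the CMNEE schema or its released data format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Argument:
    role: str   # hypothetical role label, not a CMNEE role name
    text: str   # argument mention as it appears in the document
    start: int  # character offset where the mention begins
    end: int    # character offset just past the mention


@dataclass
class Event:
    event_type: str     # hypothetical event type label
    trigger: str        # word or phrase that signals the event
    trigger_start: int  # character offset of the trigger
    arguments: List[Argument] = field(default_factory=list)


def find_span(doc: str, mention: str) -> Tuple[int, int]:
    """Return (start, end) offsets of the first occurrence of mention."""
    start = doc.index(mention)
    return start, start + len(mention)


# Made-up example sentence, used only to show the structure.
doc = "On Monday, the fleet held a live-fire exercise in the northern sea area."

trigger_start, _ = find_span(doc, "exercise")
event = Event(event_type="Exercise", trigger="exercise", trigger_start=trigger_start)

for role, mention in [
    ("Date", "Monday"),
    ("Subject", "the fleet"),
    ("Location", "the northern sea area"),
]:
    start, end = find_span(doc, mention)
    event.arguments.append(Argument(role, mention, start, end))
```

Storing character offsets alongside the mention text lets an evaluation script check each argument against the source document, which matters at the document level because the same string can occur in several sentences.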
Authors:
- Mengna Zhu
- Zijie Xu
- Kaisheng Zeng
- Kaiming Xiao
- Mao Wang
- Wenjun Ke
- Hongbin Huang