Improving Event Definition Following For Zero-Shot Event Detection (2403.02586v1)
Abstract: Existing approaches to zero-shot event detection usually train models on datasets annotated with known event types and prompt them with unseen event definitions. These approaches yield sporadic successes, yet generally fall short of expectations. In this work, we aim to improve zero-shot event detection by training models to better follow event definitions. We hypothesize that a diverse set of event types and definitions is the key for models to learn to follow event definitions, whereas existing event extraction datasets focus on annotating many high-quality examples for a few event types. To verify our hypothesis, we construct an automatically generated Diverse Event Definition (DivED) dataset and conduct comparative studies. Our experiments reveal that a large number of event types (200) and diverse event definitions significantly boost event extraction performance; in contrast, performance does not continue to improve beyond ten examples per event type. Beyond scaling, we incorporate event ontology information and hard-negative samples during training, further boosting performance. Based on these findings, we fine-tune a LLaMA-2-7B model on our DivED dataset, yielding performance that surpasses SOTA LLMs such as GPT-3.5 across three open benchmarks on zero-shot event detection.
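The training setup described above can be sketched as follows. This is a minimal, hypothetical illustration of how a definition-following training instance with a hard negative might be assembled; the field names, prompt template, and example definitions are assumptions for illustration, not the paper's actual DivED data format.

```python
# Hypothetical sketch: pair a passage with a target event definition,
# and build a hard negative by swapping in the definition of a
# similar-but-different event type. All strings are illustrative.

def build_instance(event_type, definition, passage, negative_definition):
    """Return a positive prompt and a hard-negative variant of it."""
    prompt = (
        f"Event type: {event_type}\n"
        f"Definition: {definition}\n"
        f"Text: {passage}\n"
        "List the trigger words for this event type, or answer 'none'."
    )
    # Hard negative: same passage, but the definition of a different
    # event type, teaching the model to rely on the definition itself.
    hard_negative = prompt.replace(definition, negative_definition)
    return {"positive": prompt, "hard_negative": hard_negative}

instance = build_instance(
    event_type="Attack",
    definition="An Attack event occurs when an agent uses violence to harm a target.",
    passage="Rebels shelled the village at dawn.",
    negative_definition="A Protest event occurs when people publicly demonstrate against a policy.",
)
print(instance["positive"])
```

The hard-negative variant keeps the passage fixed while changing only the definition, so the model cannot succeed by memorizing surface cues of the passage; it must actually read the definition.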
Authors: Zefan Cai, Po-Nien Kung, Ashima Suvarna, Mingyu Derek Ma, Hritik Bansal, Baobao Chang, P. Jeffrey Brantingham, Wei Wang, Nanyun Peng