
ULTRA: Unleash LLMs' Potential for Event Argument Extraction through Hierarchical Modeling and Pair-wise Refinement (2401.13218v1)

Published 24 Jan 2024 in cs.CL

Abstract: Structural extraction of events within discourse is critical, since it affords a deeper understanding of communication patterns and behavioral trends. Event argument extraction (EAE), at the core of event-centric understanding, is the task of identifying role-specific text spans (i.e., arguments) for a given event. Document-level EAE (DocEAE) focuses on arguments that are scattered across an entire document. In this work, we explore the capabilities of open-source LLMs, i.e., Flan-UL2, for the DocEAE task. To this end, we propose ULTRA, a hierarchical framework that extracts event arguments more cost-effectively: the method needs as few as 50 annotations and does not require calls to costly API endpoints. It also alleviates the positional bias issue intrinsic to LLMs. ULTRA first reads the text chunks of a document sequentially to generate a candidate argument set, from which it learns to drop non-pertinent candidates through self-refinement. We further introduce LEAFER to address the difficulty LLMs face in locating the exact boundaries of an argument span. ULTRA outperforms strong baselines, including supervised models and ChatGPT, by 9.8% when evaluated with the exact match (EM) metric.
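The two-stage pipeline described in the abstract (sequential chunk reading to pool candidates, then pairwise self-refinement to drop non-pertinent ones) can be sketched as follows. This is an illustrative sketch only: the function names, the toy `llm` stub, and the length-based comparator are assumptions for demonstration, not the paper's actual implementation.

```python
# Hierarchical candidate generation + pairwise refinement, toy version.

def chunk_document(doc, size=1):
    """Split a document into small sentence chunks to be read sequentially."""
    sents = [s.strip() for s in doc.split(".") if s.strip()]
    return [". ".join(sents[i:i + size]) for i in range(0, len(sents), size)]

def propose_candidates(chunks, llm):
    """Stage 1: query the LLM chunk by chunk, pooling candidate arguments."""
    candidates = []
    for chunk in chunks:
        for cand in llm(f"Extract event arguments from: {chunk}"):
            if cand not in candidates:
                candidates.append(cand)
    return candidates

def pairwise_refine(candidates, prefer, keep=2):
    """Stage 2: compare candidates head-to-head and keep only the
    most frequently preferred ones (non-pertinent candidates are dropped)."""
    wins = {c: 0 for c in candidates}
    for i, a in enumerate(candidates):
        for b in candidates[i + 1:]:
            wins[prefer(a, b)] += 1
    return sorted(candidates, key=lambda c: -wins[c])[:keep]

# Toy stand-ins for the LLM and the pairwise comparator.
toy_llm = lambda prompt: [w.strip(".,") for w in
                          prompt.split(": ", 1)[1].split() if w[:1].isupper()]
prefer_longer = lambda a, b: a if len(a) >= len(b) else b

doc = "Floods hit Jakarta. Thousands were evacuated. The storm began Monday."
cands = propose_candidates(chunk_document(doc), toy_llm)
print(pairwise_refine(cands, prefer_longer))  # → ['Thousands', 'Jakarta']
```

In ULTRA, the comparator and the chunk-level extractor would both be LLM calls; the pairwise formulation is what lets the model revisit each candidate in a small, positionally unbiased context rather than ranking the whole pool at once.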

Authors (5)
  1. Xinliang Frederick Zhang
  2. Carter Blum
  3. Temma Choji
  4. Shalin Shah
  5. Alakananda Vempala
Citations (6)