5W1H Extraction With Large Language Models (2405.16150v1)
Abstract: The extraction of essential news elements through the 5W1H framework (\textit{What}, \textit{When}, \textit{Where}, \textit{Why}, \textit{Who}, and \textit{How}) is critical for event extraction and text summarization. The advent of LLMs such as ChatGPT presents an opportunity to address language-related tasks through simple prompts without time-consuming model fine-tuning. However, ChatGPT encounters challenges in processing longer news texts and analyzing specific attributes in context, especially when answering questions about \textit{What}, \textit{Why}, and \textit{How}. Moreover, the effectiveness of extraction tasks depends heavily on high-quality human-annotated datasets, and the absence of such datasets for 5W1H extraction makes it difficult to apply fine-tuning strategies to open-source LLMs. To address these limitations, we first annotate a high-quality 5W1H dataset based on four typical news corpora (\textit{CNN/DailyMail}, \textit{XSum}, \textit{NYT}, \textit{RA-MDS}); second, we design several strategies, ranging from zero-shot/few-shot prompting to efficient fine-tuning, to extract the 5W1H aspects from the original news documents. The experimental results demonstrate that models fine-tuned on our labelled dataset outperform ChatGPT. Furthermore, we also explore domain adaptation by testing source-domain (e.g., NYT) models on a target-domain corpus (e.g., CNN/DailyMail) for the task of 5W1H extraction.
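The paper's exact prompts are not reproduced here, so the following is only a minimal sketch of what zero-shot 5W1H prompting of a chat LLM might look like. The OpenAI Python client, the "gpt-3.5-turbo" model name, the prompt wording, and the helper `extract_5w1h` are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: zero-shot 5W1H extraction with a chat LLM.
# Assumptions (not from the paper): OpenAI Python client, model name,
# and prompt wording are placeholders for illustration only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT_TEMPLATE = (
    "Read the following news article and extract its main event as six "
    "aspects: What, When, Where, Why, Who, and How. Answer each aspect "
    "with a short phrase taken from the article, or 'N/A' if it is absent.\n\n"
    "Article:\n{article}\n\n"
    "What:\nWhen:\nWhere:\nWhy:\nWho:\nHow:"
)

def extract_5w1h(article: str, model: str = "gpt-3.5-turbo") -> str:
    """Hypothetical helper: issue a zero-shot 5W1H extraction prompt."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": PROMPT_TEMPLATE.format(article=article)}],
        temperature=0.0,  # deterministic decoding suits extraction-style tasks
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(extract_5w1h(
        "A magnitude 6.2 earthquake struck central Italy early Wednesday, "
        "killing at least 120 people, officials said."
    ))
```

A few-shot variant would simply prepend one or two worked article/answer pairs to the same template, and the fine-tuning strategies discussed in the paper would replace the API call with an open-source LLM adapted on the annotated dataset.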
Authors: Yang Cao, Yangsong Lan, Feiyan Zhai, Piji Li