EVIT: Event-Oriented Instruction Tuning for Event Reasoning (2404.11978v1)

Published 18 Apr 2024 in cs.CL

Abstract: Events are specific occurrences, incidents, or happenings that take place against a particular background. Event reasoning aims to infer events according to certain relations and to predict future events. Cutting-edge event reasoning techniques play a crucial role in many natural language processing applications. LLMs have made significant advances in event reasoning owing to their wealth of knowledge and reasoning capabilities. However, smaller instruction-tuned models currently in use do not consistently demonstrate proficiency on these tasks. This discrepancy arises from the absence of explicit modeling of events and their interconnections in the models' instruction data. Consequently, these models struggle to comprehend event structures and semantics and to bridge the gap between their interpretations and the human understanding of events. Moreover, their limited grasp of event relations constrains their ability to deduce and incorporate pertinent event knowledge. In this paper, we propose Event-Oriented Instruction Tuning (EvIT) to train our LLM. Specifically, we first propose a novel structure named the event quadruple, which captures the structure and semantics of events and is complete as an event representation. We then design event-relation learning based on these structures and encapsulate it in an instruction-tuning formulation to better stimulate the model's event reasoning capacity. We design a heuristic unsupervised method to mine event quadruples from a large-scale corpus. Finally, we fine-tune a Llama model with Event-Oriented Instruction Tuning. We conduct extensive experiments on event reasoning tasks across several datasets. Automatic and human evaluations demonstrate that EvIT achieves competitive performance on event reasoning.
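To make the pipeline concrete, the sketch below shows how a mined event quadruple might be wrapped into an instruction-tuning record. This is a minimal illustration, not the paper's actual format: the abstract does not specify the quadruple's exact fields, so `head_event`, `relation`, `tail_event`, and `context` are hypothetical placeholders chosen for readability.

```python
from dataclasses import dataclass


@dataclass
class EventQuadruple:
    # Hypothetical fields: the abstract only states that the quadruple
    # captures event structure and semantics, so these names are
    # illustrative stand-ins, not the paper's definition.
    head_event: str
    relation: str      # e.g. "causes", "happens_before"
    tail_event: str
    context: str       # background text the events were mined from


def to_instruction_example(q: EventQuadruple) -> dict:
    """Wrap an event quadruple as an instruction-tuning record
    (instruction/output pairs, as in common Alpaca-style datasets)."""
    instruction = (
        f'Given the context: "{q.context}"\n'
        f'Which event is related to "{q.head_event}" '
        f"by the relation '{q.relation}'?"
    )
    return {"instruction": instruction, "output": q.tail_event}


# Example: one mined quadruple becomes one training record.
q = EventQuadruple(
    head_event="the dam broke",
    relation="causes",
    tail_event="the valley flooded",
    context="Heavy rain fell for a week in the mountain region.",
)
example = to_instruction_example(q)
```

A fine-tuning run would apply this conversion to every mined quadruple and train the model on the resulting instruction/output pairs, which is the general shape of the instruction-tuning formulation the abstract describes.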
