Unveiling Divergent Inductive Biases of LLMs on Temporal Data (2404.01453v1)

Published 1 Apr 2024 in cs.CL and cs.AI

Abstract: Unraveling the intricate details of events in natural language necessitates a subtle understanding of temporal dynamics. Despite the adeptness of LLMs in discerning patterns and relationships from data, their inherent comprehension of temporal dynamics remains a formidable challenge. This research meticulously explores these intrinsic challenges within LLMs, with a specific emphasis on evaluating the performance of GPT-3.5 and GPT-4 models in the analysis of temporal data. Employing two distinct prompt types, namely Question Answering (QA) format and Textual Entailment (TE) format, our analysis probes into both implicit and explicit events. The findings underscore noteworthy trends, revealing disparities in the performance of GPT-3.5 and GPT-4. Notably, biases toward specific temporal relationships come to light, with GPT-3.5 demonstrating a preference for "AFTER" in the QA format for both implicit and explicit events, while GPT-4 leans towards "BEFORE". Furthermore, a consistent pattern surfaces wherein GPT-3.5 tends towards "TRUE", and GPT-4 exhibits a preference for "FALSE" in the TE format for both implicit and explicit events. This persistent discrepancy between GPT-3.5 and GPT-4 in handling temporal data highlights the intricate nature of inductive bias in LLMs, suggesting that the evolution of these models may not merely mitigate bias but may introduce new layers of complexity.
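The two evaluation formats described in the abstract can be illustrated with a minimal sketch. The prompt templates, wording, and example events below are hypothetical illustrations of the QA and TE setups, not the authors' exact prompts:

```python
# Hypothetical sketch of the two prompt formats the paper describes.
# The actual templates and event pairs used in the study may differ.

def qa_prompt(event_a: str, event_b: str) -> str:
    """Question Answering (QA) format: ask for the temporal relation directly."""
    return (
        f"Consider two events: (A) {event_a} (B) {event_b}\n"
        "Question: Does event A happen BEFORE or AFTER event B?\n"
        "Answer with BEFORE or AFTER."
    )

def te_prompt(event_a: str, event_b: str, relation: str) -> str:
    """Textual Entailment (TE) format: verify a stated relation as TRUE/FALSE."""
    return (
        f"Premise: (A) {event_a} (B) {event_b}\n"
        f"Hypothesis: Event A happens {relation} event B.\n"
        "Is the hypothesis TRUE or FALSE?"
    )

# An explicit event pair (both times stated in the text):
print(qa_prompt("The meeting started at 9 am.", "Lunch was served at noon."))
print(te_prompt("The meeting started at 9 am.", "Lunch was served at noon.", "BEFORE"))
```

The reported biases then correspond to each model's skew over the answer space: BEFORE/AFTER in the QA format and TRUE/FALSE in the TE format.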

Authors (2)
  1. Sindhu Kishore (1 paper)
  2. Hangfeng He (26 papers)
Citations (2)