
Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning? (2406.09072v1)

Published 13 Jun 2024 in cs.CL

Abstract: Temporal reasoning is fundamental for LLMs to comprehend the world. Current temporal reasoning datasets are limited to questions about single or isolated events, falling short of capturing realistic temporal characteristics such as concurrency and intricate temporal interconnections. In this paper, we introduce CoTempQA, a comprehensive co-temporal Question Answering (QA) benchmark containing four co-temporal scenarios (Equal, Overlap, During, Mix) with 4,748 samples for evaluating the co-temporal comprehension and reasoning abilities of LLMs. Our extensive experiments reveal a significant gap between the performance of current LLMs and human-level reasoning on CoTempQA tasks. Even when enhanced with Chain of Thought (CoT) methodologies, models consistently struggle with our task. In our preliminary exploration, we discovered that mathematical reasoning plays a significant role in handling co-temporal events and proposed a strategy to boost LLMs' co-temporal reasoning from a mathematical perspective. We hope that our CoTempQA datasets will encourage further advancements in improving the co-temporal reasoning capabilities of LLMs. Our code is available at https://github.com/zhaochen0110/Cotempqa.
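The scenario names (Equal, Overlap, During) correspond to standard interval relations between two events' time spans. A minimal sketch of how such relations can be classified, assuming events are represented as hypothetical `(start, end)` year pairs; the benchmark's exact definitions may differ, and the Mix scenario combines relations across multiple event pairs:

```python
# Hypothetical classifier for co-temporal interval relations.
# Intervals are (start, end) pairs; these definitions are illustrative
# assumptions, not necessarily the paper's exact criteria.
def co_temporal_relation(a, b):
    """Classify how interval a relates to interval b in time."""
    a_start, a_end = a
    b_start, b_end = b
    if a_start == b_start and a_end == b_end:
        return "Equal"       # both events span exactly the same period
    if (b_start <= a_start and a_end <= b_end) or \
       (a_start <= b_start and b_end <= a_end):
        return "During"      # one event falls entirely within the other
    if a_start < b_end and b_start < a_end:
        return "Overlap"     # the events share some, but not all, time
    return "Disjoint"        # the events are not co-temporal

print(co_temporal_relation((2000, 2010), (2000, 2010)))  # Equal
print(co_temporal_relation((2003, 2007), (2000, 2010)))  # During
print(co_temporal_relation((2000, 2006), (2004, 2010)))  # Overlap
```

Treating timestamps as comparable numbers in this way is in the spirit of the paper's observation that mathematical reasoning helps with co-temporal events: answering "who held office X while Y was CEO?" reduces to checking interval containment and intersection.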

Authors (9)
  1. Zhaochen Su (11 papers)
  2. Juntao Li (89 papers)
  3. Jun Zhang (1008 papers)
  4. Tong Zhu (43 papers)
  5. Xiaoye Qu (62 papers)
  6. Pan Zhou (220 papers)
  7. Yan Bowen (1 paper)
  8. Yu Cheng (354 papers)
  9. Min Zhang (630 papers)
Citations (10)