Benchmarking Temporal Reasoning and Alignment Across Chinese Dynasties

Published 24 Feb 2025 in cs.CL (arXiv:2502.16922v1)

Abstract: Temporal reasoning is fundamental to human cognition and is crucial for various real-world applications. While recent advances in large language models (LLMs) have demonstrated promising capabilities in temporal reasoning, existing benchmarks primarily rely on rule-based construction, lack contextual depth, and involve a limited range of temporal entities. To address these limitations, we introduce Chinese Time Reasoning (CTM), a benchmark designed to evaluate LLMs on temporal reasoning within the extensive scope of Chinese dynastic chronology. CTM emphasizes cross-entity relationships, pairwise temporal alignment, and contextualized, culturally grounded reasoning, and thereby provides a comprehensive evaluation. Extensive experimental results reveal the challenges posed by CTM and highlight potential avenues for improvement.