E-KAR: A Benchmark for Rationalizing Natural Language Analogical Reasoning (2203.08480v1)

Published 16 Mar 2022 in cs.CL and cs.AI

Abstract: The ability to recognize analogies is fundamental to human cognition. Existing benchmarks for word analogy do not reveal the underlying process of analogical reasoning in neural models. Holding the belief that models capable of reasoning should be right for the right reasons, we propose a first-of-its-kind Explainable Knowledge-intensive Analogical Reasoning benchmark (E-KAR). Our benchmark consists of 1,655 (in Chinese) and 1,251 (in English) problems sourced from the Civil Service Exams, which require intensive background knowledge to solve. More importantly, we design a free-text explanation scheme to explain whether an analogy should be drawn, and manually annotate these explanations for every question and candidate answer. Empirical results suggest that this benchmark is very challenging for some state-of-the-art models on both the explanation generation and analogical question answering tasks, which invites further research in this area.

Authors (10)
  1. Jiangjie Chen (46 papers)
  2. Rui Xu (198 papers)
  3. Ziquan Fu (5 papers)
  4. Wei Shi (116 papers)
  5. Zhongqiao Li (1 paper)
  6. Xinbo Zhang (6 papers)
  7. Changzhi Sun (18 papers)
  8. Lei Li (1293 papers)
  9. Yanghua Xiao (151 papers)
  10. Hao Zhou (351 papers)
Citations (33)