When Reasoning Meets Information Aggregation: A Case Study with Sports Narratives (2406.12084v2)

Published 17 Jun 2024 in cs.CL and cs.AI

Abstract: Reasoning is most powerful when an LLM accurately aggregates relevant information. We examine the critical role of information aggregation in reasoning by requiring the LLM to analyze sports narratives. To succeed at this task, an LLM must infer points from actions, identify related entities, attribute points accurately to players and teams, and compile key statistics to draw conclusions. We conduct comprehensive experiments with real NBA basketball data and present SportsGen, a new method to synthesize game narratives. By synthesizing data, we can rigorously evaluate LLMs' reasoning capabilities under complex scenarios with varying narrative lengths and density of information. Our findings show that most models, including GPT-4o, often fail to accurately aggregate basketball scores due to frequent scoring patterns. Open-source models like Llama-3 further suffer from significant score hallucinations. Finally, the effectiveness of reasoning is influenced by narrative complexity, information density, and domain-specific terms, highlighting the challenges in analytical reasoning tasks.
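To make the aggregation task concrete, here is a minimal sketch of the kind of check the abstract describes: inferring points from play-by-play actions, attributing them to teams, and comparing an LLM's reported final score against the ground truth. This is not the authors' SportsGen code; the narrative entries, team names, and the toy answer parser are all illustrative assumptions.

```python
# Minimal sketch (not the paper's SportsGen implementation): score aggregation
# over a toy play-by-play narrative and an exact-match check against a
# hypothetical LLM response.
import re

# Hypothetical plays: (team, action description, point value) -- all assumed.
narrative = [
    ("Lakers", "Davis hits a three-pointer", 3),
    ("Celtics", "Tatum makes a layup", 2),
    ("Lakers", "James sinks two free throws", 2),
    ("Celtics", "Brown drains a three-pointer", 3),
]

def ground_truth_score(plays):
    """Aggregate points per team from the structured play records."""
    totals = {}
    for team, _, points in plays:
        totals[team] = totals.get(team, 0) + points
    return totals

def parse_model_answer(answer: str):
    """Pull 'Team: N' pairs out of a free-text model response (toy parser)."""
    return {team: int(score) for team, score in re.findall(r"(\w+):\s*(\d+)", answer)}

truth = ground_truth_score(narrative)                   # {'Lakers': 5, 'Celtics': 5}
model_answer = "Final score: Lakers: 5, Celtics: 7"     # hypothetical LLM output
predicted = parse_model_answer(model_answer)

# A mismatch here is the sort of aggregation error the paper reports.
print(f"truth={truth} predicted={predicted} exact_match={predicted == truth}")
```

In the paper's setting the narrative is unstructured text and the model must do the inference and attribution itself; the sketch only shows how a ground-truth tally and an exact-match comparison might be set up for evaluation.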

Authors (8)
  1. Yebowen Hu (9 papers)
  2. Kaiqiang Song (32 papers)
  3. Sangwoo Cho (22 papers)
  4. Xiaoyang Wang (134 papers)
  5. Wenlin Yao (38 papers)
  6. Hassan Foroosh (48 papers)
  7. Dong Yu (328 papers)
  8. Fei Liu (232 papers)
Citations (4)