
Causal Reasoning of Entities and Events in Procedural Texts (2301.10896v3)

Published 26 Jan 2023 in cs.CL

Abstract: Entities and events are crucial to natural language reasoning and common in procedural texts. Existing work has focused either exclusively on entity state tracking (e.g., whether a pan is hot) or on event reasoning (e.g., whether one would burn themselves by touching the pan), while these two tasks are often causally related. We propose CREPE, the first benchmark on causal reasoning of event plausibility and entity states. We show that most LLMs, including GPT-3, perform close to chance at .35 F1, lagging far behind humans at .87 F1. We boost model performance to .59 F1 by creatively representing events as programming languages while prompting LLMs pretrained on code. By injecting the causal relations between entities and events as intermediate reasoning steps in our representation, we further boost the performance to .67 F1. Our findings indicate not only the challenge that CREPE brings for LLMs, but also the efficacy of code-like prompting combined with chain-of-thought prompting for multihop event reasoning.
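
To make the idea of code-like prompting with intermediate entity states concrete, here is a minimal Python sketch. It is not the authors' prompt format, which the abstract does not specify. The function name `build_code_prompt`, the `step_N(...)` rendering, and the comment layout are all hypothetical choices made for illustration. The pan example is taken from the abstract.

```python
from typing import Dict, List


def build_code_prompt(goal: str, steps: List[str], event: str,
                      entity_states: Dict[int, List[str]]) -> str:
    """Render a procedure as Python-like text for a code-pretrained LLM.

    Hypothetical format (an assumption, not the paper's): each step becomes
    a call, and entity states are written as comments after the step so they
    act as intermediate reasoning before the event is judged.
    """
    lines = [f"# Goal: {goal}", f"# Event to judge: {event}", "def procedure():"]
    for i, step in enumerate(steps, start=1):
        lines.append(f"    step_{i}({step!r})")
        for state in entity_states.get(i, []):
            lines.append(f"    # entity state: {state}")
    # Open slot that the language model would be asked to complete.
    lines.append("    # event plausibility after the last step:")
    return "\n".join(lines)


# Example built around the abstract's pan scenario.
prompt = build_code_prompt(
    goal="Cook an egg",
    steps=["Heat the pan", "Crack the egg into the pan"],
    event="You burn yourself by touching the pan",
    entity_states={1: ["pan is hot"]},
)
print(prompt)
```

The resulting string would be sent to a model of your choice. Placing the entity-state comments before the open slot is one way to mirror the chain-of-thought style the abstract describes.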

Authors (7)
  1. Li Zhang (693 papers)
  2. Hainiu Xu (12 papers)
  3. Yue Yang (146 papers)
  4. Shuyan Zhou (28 papers)
  5. Weiqiu You (8 papers)
  6. Manni Arora (1 paper)
  7. Chris Callison-Burch (102 papers)
Citations (28)
