Generalize findings to proprietary reasoning models (Claude, ChatGPT)
Determine whether the observed mismatch between the causal dependencies humans perceive in reasoning texts and the model's actual causal dependencies, identified for DeepSeek-R1 0528 (671B) and Qwen-3 32B on GSM8K, also holds for proprietary systems such as Anthropic's Claude and OpenAI's ChatGPT, given that their internal reasoning processes cannot be examined directly.
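Such a mismatch can in principle be probed with text-only access via counterfactual resampling: truncate the trace just before a step, let a stochastic regeneration continue from there, and measure how often the final answer changes. The sketch below is illustrative only, not the paper's method; `generate`, `extract_answer`, and the prompt format are hypothetical placeholders, and proprietary APIs may not allow continuation from an edited reasoning trace at all.

```python
import random
from typing import Callable, List

def step_importance(generate: Callable[[str], str],
                    extract_answer: Callable[[str], str],
                    question: str,
                    steps: List[str],
                    i: int,
                    n_samples: int = 8) -> float:
    """Counterfactual importance of reasoning step i: regenerate the trace
    from just before step i and count how often the final answer diverges
    from the original. A step that readers judge pivotal, but whose
    resampling rarely changes the answer, is evidence of a gap between
    perceived and actual causal dependencies."""
    original_answer = extract_answer("\n".join(steps))
    prefix = question + "\n" + "\n".join(steps[:i])
    changed = sum(
        extract_answer(generate(prefix)) != original_answer
        for _ in range(n_samples)
    )
    return changed / n_samples

# Toy usage with a stubbed model; a real `generate` would call a model API
# with temperature > 0 so that resampled continuations actually vary.
if __name__ == "__main__":
    steps = ["Step 1: 3 apples + 4 apples = 7 apples.",
             "Step 2: Each apple costs $2.",
             "Step 3: Total cost is 7 * 2 = $14.",
             "Answer: 14"]
    stub = lambda prefix: prefix + "\nAnswer: " + random.choice(["14", "12"])
    answer_of = lambda text: text.split("Answer:")[-1].strip()
    print(step_importance(stub, answer_of, "Apples problem", steps, i=1))
```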
References
While these models provide valuable insights, extending these findings to proprietary systems like Claude or ChatGPT remains an open question, as we lack direct access to examine their reasoning processes.
— Humans Perceive Wrong Narratives from AI Reasoning Texts
(arXiv:2508.16599, Levy et al., 9 Aug 2025) in Limitations — Model Selection