Origin of CoT success in LLM mathematical reasoning
Determine whether the strong performance of large language models (LLMs) on mathematical problems under Chain-of-Thought (CoT) prompting arises primarily from search-based strategies, rote procedural execution, or rule-consistent logical inference.
References
LLMs demonstrate strong performance on mathematical problems when prompted with Chain-of-Thought (CoT), yet it remains unclear whether this success stems from search, rote procedures, or rule-consistent reasoning.
— DAG-Math: Graph-Guided Mathematical Reasoning in LLMs
(arXiv:2510.19842, Zhang et al., 19 Oct 2025), Abstract (page 1)