Quantifying memorization versus generalized reasoning in LLM mathematical problem solving
Determine how much of the apparent mathematical reasoning that large language models exhibit on high-school-level word problems is attributable to memorization of training data combined with shallow heuristics, and how much reflects learned general principles of reasoning that generalize beyond the training examples.
References
However, it is not clear how much of these apparent reasoning capabilities can be attributed to memorization of the training material combined with shallow heuristics, as opposed to having learned actual general principles of reasoning by generalizing from the training examples.
— Large Language Models and Mathematical Reasoning Failures
(2502.11574 - Boye et al., 17 Feb 2025) in Section 1 (Introduction)