Extent of new versus repurposed reasoning capabilities in thinking language models
Determine to what extent thinking language models learn entirely new reasoning capabilities during post-training, as opposed to repurposing capabilities and representations that their base models already acquired during pre-training, in order to explain their superior performance on reasoning tasks.
References
Despite consistent performance gains, it remains unclear to what extent thinking models learn entirely new reasoning capabilities or repurpose pre-existing base model ones.
— Base Models Know How to Reason, Thinking Models Learn When
(2510.07364 - Venhoff et al., 8 Oct 2025) in Abstract (p. 1)