Can LLMs acquire genuinely new reasoning strategies beyond pre/post-training?
Determine whether large language models (LLMs) can acquire or generalize genuinely new reasoning strategies, beyond the sharpened skills encoded in their parameters during pre-training or post-training.
References
It remains an open question whether LLMs can acquire or generalize genuinely new reasoning strategies, beyond the sharpened skills encoded in their parameters during pre-training or post-training.
— DELTA-Code: How Does RL Unlock and Transfer New Programming Algorithms in LLMs?
(2509.21016 - Sun et al., 25 Sep 2025) in Abstract (page 1)