Explain why recursion yields superior generalization over scaling depth or size
Establish a theoretical explanation for why recursive reasoning with deep supervision, as implemented in the Tiny Recursion Model (TRM), generalizes markedly better than conventional supervised architectures that are made larger or deeper. Determine whether reduced overfitting is the primary mechanism behind the observed gains.
References
Although we simplified and improved on deep recursion, the question of why recursion helps so much compared to using a larger and deeper network remains to be explained; we suspect it has to do with overfitting, but we have no theory to back this explanation.
— Less is More: Recursive Reasoning with Tiny Networks
(2510.04871 - Jolicoeur-Martineau, 6 Oct 2025) in Conclusion