Impact of recurrent depth on transformer performance

Determine how recurrent depth affects transformer-based language models by rigorously characterizing the relationship between the number of recurrent iterations and model performance across tasks and conditions.

Background

The work situates depth recurrence within broader research on depth in transformers and latent reasoning. Although many studies examine depth effects in transformers, there is no established understanding of how recurrent depth specifically influences performance.

Clarifying the performance implications of recurrent depth would inform architectures and training strategies for models that exploit test-time compute by looping internal layers.
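To make the setup concrete, here is a minimal, hypothetical sketch in PyTorch of a depth-recurrent transformer: a single weight-tied block is looped a configurable number of times, so recurrent depth becomes a test-time knob rather than a parameter-count change. The class name, dimensions, and the sweep at the end are illustrative assumptions, not the paper's implementation.

import torch
import torch.nn as nn

class RecurrentDepthBlock(nn.Module):
    """Hypothetical depth-recurrent transformer: one shared block
    applied r times, so depth is varied without adding parameters."""

    def __init__(self, d_model=256, n_heads=4, default_recurrences=4):
        super().__init__()
        self.default_recurrences = default_recurrences
        # Single weight-tied layer reused at every recurrent step.
        self.block = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )

    def forward(self, x, num_recurrences=None):
        # The loop count can be overridden at inference to trade
        # extra test-time compute for (potentially) better outputs.
        r = num_recurrences if num_recurrences is not None else self.default_recurrences
        for _ in range(r):
            x = self.block(x)
        return x

# Illustrative sweep: apply the same weights at several recurrent
# depths, the kind of measurement the open question calls for
# (here on random inputs; a real study would score downstream tasks).
model = RecurrentDepthBlock().eval()
h = torch.randn(1, 16, 256)  # (batch, seq_len, d_model)
with torch.no_grad():
    for r in (1, 2, 4, 8):
        out = model(h, num_recurrences=r)
        print(r, out.norm().item())

A rigorous characterization would replace the random-input sweep with evaluations on held-out benchmarks across tasks, model scales, and training regimes, which is precisely the open question.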

References

Many works study the impact of depth in transformers, both theoretically and practically \citep{levine2020depth,merrill2022saturated,mcleish2025gemstones,zuo2025falcon,merrill2025little,csordas2025language}, but it is still an open question how recurrent depth impacts the performance of transformers.

Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence (2511.07384 - McLeish et al., 10 Nov 2025) in Appendix A, Extended Related Works