Extrapolating recurrence depth at test time

Develop training and architectural methods for depth-recurrent transformer language models that enable reliable extrapolation to greater recurrence depths at test time, allowing the models to solve problems that are harder than those encountered during training while maintaining stability and performance.

Background

The paper proposes converting pretrained non-recurrent LLMs into depth-recurrent models via continued pretraining and shows benefits for math reasoning under fixed training compute. Depth-recurrence allows increasing test-time compute by iterating a recurrent block more times without increasing parameter count.
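For intuition, here is a minimal sketch of a depth-recurrent forward pass, assuming a PyTorch-style setup: one shared transformer block is iterated a variable number of times, so test-time compute grows with the iteration count while the parameter count stays fixed. The module layout, zero-initialized state, and re-injection of the input at every iteration are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class RecurrentBlock(nn.Module):
    """A single transformer block whose weights are shared across all recurrence steps."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, state: torch.Tensor, inputs: torch.Tensor) -> torch.Tensor:
        # Re-inject the token representations at every iteration (an assumption of this sketch).
        h = state + inputs
        a = self.norm1(h)
        h = h + self.attn(a, a, a)[0]
        h = h + self.mlp(self.norm2(h))
        return h

def recurrent_forward(block: RecurrentBlock, inputs: torch.Tensor, depth: int) -> torch.Tensor:
    """Iterate the shared block `depth` times; `depth` can be raised at test time without new parameters."""
    state = torch.zeros_like(inputs)
    for _ in range(depth):
        state = block(state, inputs)
    return state
```

In this framing, "recurring deeper at test time" simply means calling `recurrent_forward` with a larger `depth` than any value used during training; the open question is how to train so that the resulting hidden states remain stable and useful.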

While retrofitted recurrence and depth-scheduling strategies improve training efficiency and yield test-time gains, the authors note that building depth-recurrent models that can effectively recur deeper at inference than during training remains unresolved. Such a capability would allow harder problems to be solved by scaling internal latent computation beyond the settings seen in training.
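To make the gap concrete, the hedged sketch below shows one common recipe for encouraging depth generalization: sample the recurrence depth randomly at each training step, then probe the same weights at larger depths at evaluation. The `recurrence_depth` keyword, the depth ranges, and the uniform sampling are hypothetical choices for illustration, not the schedule used in the paper.

```python
import random
import torch
import torch.nn.functional as F

TRAIN_DEPTHS = list(range(1, 9))   # depths seen during training (assumed range)
EVAL_DEPTHS = (8, 16, 32, 64)      # deeper depths probed at test time (assumed)

def training_step(model, batch, optimizer):
    # Stochastic depth schedule: each step uses a randomly sampled recurrence depth.
    depth = random.choice(TRAIN_DEPTHS)
    logits = model(batch["input_ids"], recurrence_depth=depth)  # hypothetical interface
    loss = F.cross_entropy(logits.view(-1, logits.size(-1)), batch["labels"].view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def probe_extrapolation(model, eval_batch):
    # Evaluate the same weights at depths beyond the training range to measure extrapolation.
    return {d: model(eval_batch["input_ids"], recurrence_depth=d) for d in EVAL_DEPTHS}
```

Whether such randomized schedules, architectural changes, or other training interventions suffice for reliable extrapolation is exactly the unresolved question stated above.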

References

One unsolved problem is how to most effectively build depth-recurrent models that can recur deeper at test time to solve harder problems than were seen during training.

Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence (2511.07384 - McLeish et al., 10 Nov 2025) in Discussion, Section 5