Relationship between reasoning trajectory length and increased inference-time compute
Determine the functional relationship between the length of chain-of-thought reasoning trajectories and the performance effect of increased inference-time compute in large language models. In particular, establish whether longer chains consistently improve performance or can instead degrade it, and characterize the conditions under which each outcome occurs.
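One way to start probing this question empirically is to bin sampled reasoning traces by length and inspect accuracy per bin. The sketch below is illustrative only: the function names, the bin width, the trend labels, and the sample records are all hypothetical and do not come from the cited works. A per-bin curve of this kind is purely descriptive and does not control for problem difficulty, which may itself correlate with trace length.

```python
from collections import defaultdict


def accuracy_by_length(records, bin_width):
    """Group (token_count, is_correct) records into length bins.

    Returns a list of (bin_start, accuracy, n_samples), sorted by bin_start.
    """
    totals = defaultdict(lambda: [0, 0])  # bin_start -> [n_correct, n_total]
    for token_count, is_correct in records:
        bin_start = (token_count // bin_width) * bin_width
        totals[bin_start][0] += int(is_correct)
        totals[bin_start][1] += 1
    return [(b, c / n, n) for b, (c, n) in sorted(totals.items())]


def classify_trend(curve, tol=0.0):
    """Coarsely label the shape of an accuracy-vs-length curve.

    Labels (hypothetical, for illustration): 'insufficient data',
    'non-decreasing', 'non-increasing', 'peaks', or 'mixed'.
    """
    accs = [acc for _, acc, _ in curve]
    if len(accs) < 2:
        return "insufficient data"
    diffs = [later - earlier for earlier, later in zip(accs, accs[1:])]
    if all(d >= -tol for d in diffs):
        return "non-decreasing"
    if all(d <= tol for d in diffs):
        return "non-increasing"
    # Rise up to the first maximum, then fall: longer chains help, then hurt.
    peak = accs.index(max(accs))
    rising = all(d >= -tol for d in diffs[:peak])
    falling = all(d <= tol for d in diffs[peak:])
    return "peaks" if rising and falling else "mixed"


# Synthetic records, made up for illustration: (reasoning tokens, answer correct)
records = [
    (100, False), (150, True),
    (1100, True), (1200, True),
    (2100, True), (2300, False),
]
curve = accuracy_by_length(records, bin_width=1000)
trend = classify_trend(curve)
```

A "peaks" label would correspond to the regime in which extending reasoning first helps and then hurts, while "non-decreasing" would correspond to consistent gains. Separating the two in practice would likely require conditioning on task, model, and the method used to lengthen the chain.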
References
For example, the relationship between the length of reasoning trajectories and the subsequent increased inference-time computational capacity remains unclear; while some works find clear gains (Muennighoff et al., 2025; Li et al., 2025), other work reports that shorter chains can be more effective and that continuing to extend reasoning (e.g., via "wait" tokens) can yield degradation in performance (Wu et al., 2025; Marjanović et al., 2025).