Effective test-time scaling for large language model reasoning
Determine principled and effective strategies for test-time scaling (i.e., increasing inference-time compute during generation) that improve the reasoning performance of large language models across tasks.
References
However, the challenge of effective test-time scaling remains an open question for the research community.
Source: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (arXiv:2501.12948, DeepSeek-AI et al., 22 Jan 2025), Section 1, Introduction