Next stage of scaling RL for LLMs
Determine the next stage of scaling reinforcement learning for large language models, specifically assessing whether open-ended reinforcement learning constitutes a viable and effective direction for continued capability growth.
References
The next stage of scaling RL for LLMs remains an open question, with open-ended RL presenting a particularly challenging and promising direction.
Source: A Survey of Reinforcement Learning for Large Reasoning Models (Zhang et al., arXiv:2509.08827, 10 Sep 2025), caption of Figure rl_evol