Minimal conditions for RL success
Determine the minimal sufficient conditions, beyond coverage assumptions, under which reinforcement learning (post-training) succeeds. Precisely characterize the weakest algorithmic or distributional requirements that RL methods need in order to achieve high downstream reward when coverage is limited or absent.
References
While there is ample evidence current RL techniques can fail in the absence of coverage \citep{yue2025does,gandhi2025cognitive,wu2025invisible}, it is not clear what the minimal conditions required for RL are.
— The Coverage Principle: How Pre-training Enables Post-Training
(Chen et al., 16 Oct 2025) in Section 6.1, Simplifications in the Problem Formulation