Impact of rubric-judge scoring scales on training
Evaluate alternative scoring scales for the rubric-based reward judge used in RLER, and determine how each scale affects reinforcement learning stability and downstream performance relative to the current normalized 0–2 scale.
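To make the comparison concrete, here is a minimal sketch of how an integer judge score on a 0..K scale could be mapped to a reward in [0, 1]. It assumes normalization divides by the scale maximum and that per-rubric scores are combined with an unweighted mean. Both choices are assumptions for illustration, and the names `normalize_score` and `rubric_reward` are hypothetical, not taken from the paper.

```python
from typing import Sequence


def normalize_score(raw: int, scale_max: int) -> float:
    """Map an integer judge score in [0, scale_max] to [0, 1].

    Assumption: normalization is a plain division by the scale maximum.
    """
    if scale_max <= 0:
        raise ValueError("scale_max must be positive")
    if not 0 <= raw <= scale_max:
        raise ValueError(f"score {raw} is outside 0..{scale_max}")
    return raw / scale_max


def rubric_reward(raw_scores: Sequence[int], scale_max: int) -> float:
    """Combine per-rubric judge scores into one scalar reward.

    Assumption: an unweighted mean of the normalized scores.
    """
    if not raw_scores:
        raise ValueError("need at least one rubric score")
    total = sum(normalize_score(s, scale_max) for s in raw_scores)
    return total / len(raw_scores)


# The same relative judgments expressed on two scales.
reward_0_to_2 = rubric_reward([2, 1, 0], scale_max=2)
reward_0_to_4 = rubric_reward([4, 2, 0], scale_max=4)
```

Under these assumptions a 0–2 scale gives each rubric three possible reward levels (0, 0.5, 1), while a 0–4 scale gives five. What changes between scales is the granularity of the reward signal, which is the variable the proposed evaluation would study.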
References
We leave exploring different scoring scales to future work.
— DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research
(2511.19399 - Shao et al., 24 Nov 2025) in Appendix, Subsection "Rubric Reward Judge Prompt"