Transferability of GlobalRAG to Very Large Language Models
Determine whether GlobalRAG, a reinforcement learning framework for multi-hop question answering that integrates planning-aware rewards and progressive weight annealing, can transfer effectively to very large language models such as DeepSeek-R1 when trained with RL.
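The paper does not describe its annealing schedule in this excerpt; as a rough illustration only, the sketch below shows one plausible way a planning-reward weight could be progressively annealed against an answer reward during RL training. All function names, weight values, and the linear schedule are assumptions for illustration, not details from GlobalRAG.

```python
# Hypothetical sketch of progressive weight annealing for a planning-aware
# reward blended with an answer reward during RL training. The linear
# schedule and all parameters are illustrative assumptions, not taken
# from the GlobalRAG paper.

def annealed_weight(step: int, total_steps: int,
                    w_start: float = 1.0, w_end: float = 0.1) -> float:
    """Linearly anneal the planning-reward weight over training."""
    frac = min(step / max(total_steps, 1), 1.0)
    return w_start + (w_end - w_start) * frac

def combined_reward(answer_reward: float, planning_reward: float,
                    step: int, total_steps: int) -> float:
    """Blend the outcome (answer) reward and the planning reward,
    shifting emphasis from planning to answer quality as training
    progresses."""
    w = annealed_weight(step, total_steps)
    return (1.0 - w) * answer_reward + w * planning_reward

if __name__ == "__main__":
    # Early in training the planning signal dominates; later the
    # final-answer reward takes over.
    for step in (0, 500, 1000):
        r = combined_reward(answer_reward=1.0, planning_reward=0.5,
                            step=step, total_steps=1000)
        print(f"step {step}: combined reward = {r:.3f}")
```

The open question is whether such a reward shaping scheme, tuned on smaller models, yields the same benefits when the policy is a very large model like DeepSeek-R1.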
References
First, due to computational and cost constraints, we are unable to conduct RL training on very large-scale models (e.g., DeepSeek-R1). Whether our approach can effectively transfer to such models remains an open question.
— GlobalRAG: Enhancing Global Reasoning in Multi-hop Question Answering via Reinforcement Learning
(arXiv:2510.20548, Luo et al., 23 Oct 2025), Section: Limitations