Further Scaling of Deep RL Architectures
Determine whether deep reinforcement learning network architectures can be scaled to greater depths and widths than those evaluated in this study while maintaining stable training and strong performance when combined with multi-skip residual connections, Layer Normalization, and Kronecker-factored optimization. Characterize the limits of such scaling under practical computational constraints.
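To make the architectural ingredients named in the question concrete, below is a minimal PyTorch-style sketch of a residual MLP torso that combines pre-Layer-Normalization blocks with an additional long-range skip from the encoded input to every block, one plausible reading of "multi-skip" residual connections. The class names, the ReLU activation, and the specific skip pattern are illustrative assumptions rather than the paper's exact architecture, and the Kronecker-factored optimizer is not shown.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Pre-LayerNorm MLP block with an identity skip connection."""

    def __init__(self, width: int):
        super().__init__()
        self.norm = nn.LayerNorm(width)
        self.mlp = nn.Sequential(
            nn.Linear(width, width),
            nn.ReLU(),
            nn.Linear(width, width),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Local residual connection around the normalized MLP.
        return x + self.mlp(self.norm(x))


class MultiSkipTorso(nn.Module):
    """Stack of residual blocks in which every block's output also receives
    a skip directly from the encoded input (an assumed multi-skip pattern)."""

    def __init__(self, in_dim: int, width: int, depth: int):
        super().__init__()
        self.encoder = nn.Linear(in_dim, width)
        self.blocks = nn.ModuleList([ResidualBlock(width) for _ in range(depth)])
        self.final_norm = nn.LayerNorm(width)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        h0 = self.encoder(obs)
        h = h0
        for block in self.blocks:
            # Long-range skip back to the encoded input at every depth.
            h = block(h) + h0
        return self.final_norm(h)


# Example: widening and deepening the torso is a matter of changing
# `width` and `depth`; the open question is how far this scales stably.
torso = MultiSkipTorso(in_dim=64, width=512, depth=16)
features = torso(torch.randn(32, 64))  # shape (32, 512)
```

Scaling experiments of the kind the question asks for would sweep `depth` and `width` (hypothetical parameter names here) while holding the agent and optimizer fixed.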
References
Our study is constrained by computational resources, which limited our ability to explore architectures beyond a certain size. While our interventions show consistent improvements across agents and environments, further scaling remains an open question.
— Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning
(arXiv:2506.15544, Castanyer et al., 18 Jun 2025), Section 7 (Discussion), Limitations