Benefits of stricter normalization for large-scale models (especially LLMs trained with RL)
Determine whether stricter normalization schemes, such as unit-norm constraints enforced through hyperspherical (ℓ2) normalization, benefit large-scale models, particularly large language models trained with reinforcement learning objectives. Establish the contexts in which such normalization is advantageous relative to conventional normalization techniques.
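To make the contrast concrete, the following minimal sketch compares ℓ2 (hyperspherical) normalization of a single feature vector with a plain layer normalization. It is an illustration only and does not reproduce the cited paper's architecture. The function names, the example vector, and the epsilon values are assumptions chosen for this example, and the layer normalization omits the learned scale and shift that practical implementations usually include.

```python
import math


def l2_norm(x):
    """Euclidean (l2) norm of a vector given as a list of floats."""
    return math.sqrt(sum(v * v for v in x))


def l2_normalize(x, eps=1e-8):
    """Hyperspherical normalization: rescale x so its l2 norm is (almost exactly) 1.

    eps is an illustrative guard against division by zero.
    """
    norm = l2_norm(x)
    return [v / (norm + eps) for v in x]


def layer_norm(x, eps=1e-5):
    """Layer normalization without learned scale or shift (a simplification).

    Centers the vector and divides by its standard deviation, so the output
    norm is approximately sqrt(len(x)) rather than 1.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


# Hypothetical feature vector used only for illustration.
features = [3.0, 4.0, 0.0, 0.0]

unit = l2_normalize(features)
ln = layer_norm(features)

print("l2-normalized norm:", l2_norm(unit))
print("layer-norm output norm:", l2_norm(ln))
```

In this sketch the ℓ2-normalized vector always lies on the unit hypersphere regardless of the input scale, whereas the layer-normalized vector has a norm near the square root of its dimension, and a learned gain (omitted here) would let that norm vary further. Whether the tighter constraint helps at LLM scale is the open question stated above.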
References
Furthermore, with increasing interest in RL for training LLMs, the potential benefits of using stricter normalization for large models remain an exciting open question.
— Hyperspherical Normalization for Scalable Deep Reinforcement Learning
(2502.15280 - Lee et al., 21 Feb 2025) in Section 6: Lessons and Opportunities