Ultimate extrapolation boundaries of Fitting and Transfer paradigms
Ascertain the maximum scale at which learning-rate predictions derived from the Fitting paradigm and from μTransfer-based hyperparameter transfer remain accurate in large-scale pre-training, thereby identifying the ultimate extrapolation boundaries for these approaches.
References
Due to computational resource constraints, this study did not investigate the ultimate extrapolation boundaries (i.e., the maximum scale at which these predictions remain accurate) for both the Fitting and Transfer paradigms.
— How to Set the Learning Rate for Large-Scale Pre-training?
(2601.05049 - Zhou et al., 8 Jan 2026) in Section: Limitations