Behavior at the critical “pentuple point” α = β = 1/2
Characterize the behavior of the loss curve and the compute-optimal frontier at the critical point α = β = 1/2 of the (α, β)-phase diagram for the power-law random features (PLRF) model trained with one-pass SGD. At this point all components of the forcing function (F_pp, F_ac, F_0) and the kernel (K_pp) mix and interact.
References
Moreover, there exists an interesting critical point $\alpha = \beta = \tfrac{1}{2}$ where all the parts of the forcing function and kernel mix and interact with each other. The behavior of the loss at the pentuple point (see Fig~\ref{fig:phase_diagram}) we leave for future research.
— 4+3 Phases of Compute-Optimal Neural Scaling Laws
(2405.15074 - Paquette et al., 23 May 2024) in Section “The 4 Phases”