Characterize covariance structure favoring GN^{-1/2} over GN^{-1} (via condition number ratio)
Characterize the set of covariance matrices Cov_x for which r(Cov_x) > 1, where r(Cov_x) := cond(Cov_x^{1/2} diag(Cov_x^{-1}) Cov_x^{1/2}) / cond(Cov_x^{1/2} diag(Cov_x^{-1/2}) Cov_x^{1/2}). This will identify when, under the identity basis for diagonal preconditioning, using the Gauss–Newton diagonal with power −1/2 yields a more favorable preconditioned-Hessian condition number than using power −1.
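For exploring the question numerically, the ratio can be evaluated for any given positive-definite covariance matrix. The sketch below is illustrative only and rests on one labeled assumption: diag(Cov_x^{p}) is read as the diagonal of Cov_x raised elementwise to the power p, which is the "Gauss–Newton diagonal with power p" reading. If the intended meaning is instead the diagonal of the matrix power Cov_x^{p}, the line computing `d` would need to change. The helper names `sym_sqrt`, `precond_cond`, and `r` are hypothetical and do not come from the paper.

```python
import numpy as np


def sym_sqrt(cov):
    """Symmetric square root of a symmetric positive-definite matrix."""
    w, v = np.linalg.eigh(cov)
    return (v * np.sqrt(w)) @ v.T


def precond_cond(cov, power):
    """Condition number of Cov^{1/2} D Cov^{1/2}.

    ASSUMPTION: D = diag(Cov) ** power, i.e. the diagonal entries of Cov
    raised elementwise to `power`.
    """
    d = np.diag(cov) ** power
    root = sym_sqrt(cov)
    m = root @ np.diag(d) @ root
    m = 0.5 * (m + m.T)  # remove round-off asymmetry before eigvalsh
    eig = np.linalg.eigvalsh(m)  # eigenvalues in ascending order
    return eig[-1] / eig[0]


def r(cov):
    """Ratio cond(power -1) / cond(power -1/2); r > 1 favors power -1/2."""
    return precond_cond(cov, -1.0) / precond_cond(cov, -0.5)


# Two sanity checks under the assumption above.
r_diag = r(np.diag([1.0, 4.0]))                 # diagonal covariance
r_corr = r(np.array([[1.0, 0.5], [0.5, 1.0]]))  # constant diagonal
```

Under this reading, a diagonal covariance gives r ≤ 1, because power −1 whitens it exactly: the first example evaluates to approximately 0.5. A covariance with constant diagonal gives r = 1, because both preconditioners reduce to scalar multiples of the identity. Any matrix with r > 1 must therefore have both off-diagonal correlation and unequal variances.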
References
Characterizing covariance matrices for which $r(\Cov_x) > 1$ is left as future work.
— Adam or Gauss-Newton? A Comparative Study In Terms of Basis Alignment and SGD Noise
(2510.13680 - Liu et al., 15 Oct 2025) in Appendix A.3 (Comparing GN powers)