Empirical Fisher spectrum structure in deep neural networks
Prove the conjecture that the empirical Fisher information matrix of deep neural networks has a spectrum characterized by a bulk of eigenvalues concentrated near zero together with a small number of extremely large eigenvalues.
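For orientation, the empirical Fisher information matrix is commonly taken to be the average outer product of per-example log-likelihood gradients, evaluated at the observed labels. The following is a minimal NumPy sketch of how its spectrum could be inspected numerically. It is not the construction used in the cited paper: the two-layer tanh network, the unit-variance Gaussian likelihood, the random data, and the function names `per_example_score` and `empirical_fisher_eigenvalues` are all assumptions chosen for illustration.

```python
import numpy as np

def per_example_score(W1, w2, x, y):
    """Gradient of log p(y | x) w.r.t. all parameters for one example.

    Assumed model (illustrative only): f(x) = w2 . tanh(W1 x), with
    y ~ Normal(f(x), 1), so log p = -0.5 * (y - f(x))**2 + const.
    """
    h = np.tanh(W1 @ x)                        # hidden activations
    resid = y - w2 @ h                         # y - f(x)
    grad_w2 = h                                # d f / d w2
    grad_W1 = np.outer(w2 * (1.0 - h**2), x)   # d f / d W1
    # Chain rule: d log p / d theta = (y - f(x)) * d f / d theta
    return resid * np.concatenate([grad_W1.ravel(), grad_w2])

def empirical_fisher_eigenvalues(W1, w2, X, Y):
    """Eigenvalues (ascending) of F = (1/n) * sum_i g_i g_i^T."""
    G = np.stack([per_example_score(W1, w2, x, y) for x, y in zip(X, Y)])
    F = G.T @ G / len(X)                       # symmetric positive semidefinite
    return np.linalg.eigvalsh(F)

# Hypothetical toy setup: 16 parameters, 8 examples.
rng = np.random.default_rng(0)
d, hidden, n = 3, 4, 8
W1 = rng.normal(size=(hidden, d))
w2 = rng.normal(size=hidden)
X = rng.normal(size=(n, d))
Y = rng.normal(size=n)

eigs = empirical_fisher_eigenvalues(W1, w2, X, Y)
print(eigs)
```

With 16 parameters and 8 examples the matrix has rank at most 8, so at least half of the eigenvalues vanish up to rounding error. That rank deficiency is an elementary effect of having fewer examples than parameters; a toy run like this neither proves nor tests the conjecture, which concerns the shape of the spectrum in deep networks.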
References
The main interest in the spectrum is to prove a long-standing conjecture about the structure of the empirical Fisher information: most of its eigenvalues are bulked together near zero while there are a few extremely large ones, which are known to cause issues in optimization.
— Non-identifiability distinguishes Neural Networks among Parametric Models
(2504.18017 - Chatterjee et al., 25 Apr 2025) in Discussion (Section 4)