Understanding when additional singular values improve performance
Determine why retaining additional singular values in the truncated SVD of the per-sample loss Jacobian in Sven improves optimization on some tasks but not others, and characterize the task- and loss-landscape-dependent conditions under which a larger k helps or hurts.
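To make the role of k concrete, here is a minimal NumPy sketch of a rank-k pseudo-inverse step built from a truncated SVD of a per-sample loss Jacobian. It is one plausible reading of the setup, not the paper's implementation: the function name `truncated_svd_step`, the toy Jacobian, its spectrum, and the update form (minus the rank-k pseudo-inverse applied to the per-sample losses) are all assumptions made for illustration.

```python
import numpy as np

def truncated_svd_step(J, losses, k):
    """Hypothetical rank-k step: -V_k diag(1/s_k) U_k^T losses.

    J has shape (n_samples, n_params); losses has shape (n_samples,).
    """
    U, s, Vt = np.linalg.svd(J, full_matrices=False)
    k = min(k, len(s))
    # Project the losses onto the top-k left singular vectors and rescale by 1/s_i.
    coeffs = (U[:, :k].T @ losses) / s[:k]
    # Map back to parameter space through the top-k right singular vectors.
    return -Vt[:k].T @ coeffs

rng = np.random.default_rng(0)

# Toy Jacobian (an assumption): 8 samples, 20 parameters, decaying singular values.
U0, _ = np.linalg.qr(rng.normal(size=(8, 8)))
V0, _ = np.linalg.qr(rng.normal(size=(20, 8)))
spectrum = 2.0 ** -np.arange(8)
J = U0 @ np.diag(spectrum) @ V0.T
losses = rng.uniform(0.5, 1.5, size=8)

# Step norm and linearized remaining loss as k grows.
step_norms = []
for k in range(1, 9):
    step = truncated_svd_step(J, losses, k)
    step_norms.append(np.linalg.norm(step))
    linearized = np.linalg.norm(losses + J @ step)
    print(f"k={k}  |step|={step_norms[-1]:.3f}  |linearized loss|={linearized:.3f}")
```

In this sketch each extra singular value removes more of the linearized loss but contributes a component scaled by 1/s_i, so the step norm can only grow with k. Whether that longer step helps presumably depends on how far the linear model of the per-sample losses can be trusted, which is one way the loss landscape could enter. The sketch only illustrates that trade-off and does not settle the open question.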
References
It is not immediately clear why additional singular values are beneficial in some cases and not others, but it is likely related to the overall loss landscape of the problem.
— Sven: Singular Value Descent as a Computationally Efficient Natural Gradient Method
(2604.01279 - Bright-Thonney et al., 1 Apr 2026) in Section 4 (Experiments), Regression Results