Reconciling the proper bases for spectral normalization and variance adaptation

Investigate how to reconcile the choice of basis in which spectral normalization and variance adaptation are applied within matrix-whitening optimizers, specifically whether variance adaptation should be performed in the rotated eigenbasis (as in SOAP) or in the original elementwise basis after orthogonalization (as in AdaMuon).

Background

The paper shows that variance adaptation provides benefits regardless of the basis in which it is applied: in SOAP, adaptation occurs in the rotated eigenbasis, while in AdaMuon it is applied elementwise after Newton–Schulz orthogonalization. This suggests that matrix whitening serves two roles, spectral normalization and variance adaptation, and that the two can be decoupled.
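To make the two basis choices concrete, the following is a minimal NumPy sketch, not the paper's or either optimizer's actual implementation. It assumes a single 2-D gradient matrix `G`, omits momentum, bias correction, and learning-rate scaling, and uses the plain cubic Newton–Schulz iteration rather than the tuned polynomial used in practice. The function names, the shared decay `beta2`, and `eps` are hypothetical and chosen only for illustration.

```python
import numpy as np


def newton_schulz_orthogonalize(G, steps=20):
    """Approximate the polar factor U V^T of G = U S V^T.

    Assumption: plain cubic iteration X <- 1.5 X - 0.5 X X^T X, not the
    tuned polynomial that Muon-style optimizers use in practice.
    """
    # Dividing by the Frobenius norm puts every singular value in (0, 1],
    # which is inside the convergence region of the cubic iteration.
    X = G / (np.linalg.norm(G) + 1e-12)
    for _ in range(steps):
        X = 1.5 * X - 0.5 * X @ X.T @ X
    return X


def elementwise_after_orthogonalization(G, v, beta2=0.99, eps=1e-8):
    """AdaMuon-like ordering: orthogonalize first, then adapt per entry
    in the original (elementwise) basis."""
    O = newton_schulz_orthogonalize(G)
    v = beta2 * v + (1.0 - beta2) * O**2        # second moment of O's entries
    return O / (np.sqrt(v) + eps), v


def adaptation_in_eigenbasis(G, L, R, v, beta2=0.99, eps=1e-8):
    """SOAP-like ordering: rotate into the eigenbasis of the left/right
    gradient covariances, adapt per entry there, then rotate back."""
    L = beta2 * L + (1.0 - beta2) * G @ G.T     # left covariance estimate
    R = beta2 * R + (1.0 - beta2) * G.T @ G     # right covariance estimate
    _, Q_L = np.linalg.eigh(L)
    _, Q_R = np.linalg.eigh(R)
    G_rot = Q_L.T @ G @ Q_R                     # gradient in the rotated basis
    v = beta2 * v + (1.0 - beta2) * G_rot**2    # second moment in that basis
    update = Q_L @ (G_rot / (np.sqrt(v) + eps)) @ Q_R.T
    return update, L, R, v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    G = rng.standard_normal((4, 3))
    u_elem, _ = elementwise_after_orthogonalization(G, np.zeros_like(G))
    u_rot, *_ = adaptation_in_eigenbasis(
        G, np.zeros((4, 4)), np.zeros((3, 3)), np.zeros_like(G)
    )
    print(u_elem.shape, u_rot.shape)
```

The only structural difference between the two functions is where the second-moment estimate `v` lives: over the entries of the orthogonalized matrix in the first, and over the entries of the rotated gradient in the second. That placement is the basis choice the open question asks how to reconcile.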

The authors present preliminary evidence (including a didactic SPA variant) but note that properly reconciling the basis choices for these two components remains unresolved and requires further investigation.

References

We leave further examination on how to reconcile the proper bases for spectral-normalization and variance-adaptation to future investigation.

What Really Matters in Matrix-Whitening Optimizers? (2510.25000 - Frans et al., 28 Oct 2025) in Subsection “Why does variance adaptation still work when done after orthogonalization?”