Optimal Newton–Schulz polynomial coefficients for AOL-preconditioned Turbo-Muon
Develop optimal iteration-dependent polynomial coefficients for the quintic Newton–Schulz iteration when preceded by Almost Orthogonal Layer (AOL) preconditioning in Turbo-Muon, tailored to a fixed iteration budget, and assess whether these coefficients improve polar approximation performance relative to coefficients designed for the non-preconditioned case.
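For concreteness, the setup described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the Turbo-Muon implementation. The helper names (`aol_rescale`, `newton_schulz`, `polar_error`) are hypothetical. The AOL rescaling follows the column-scaling formula d_j = (sum_i |W^T W|_ij)^(-1/2) as usually stated for Almost Orthogonal Layers. The two coefficient triples are the classical quintic (15/8, -10/8, 3/8) and the triple commonly quoted for Muon, neither of which is tuned for the preconditioned case. Note that column rescaling changes the matrix, so the error is measured against the polar factor of the rescaled matrix.

```python
import numpy as np

def aol_rescale(W, eps=1e-12):
    """Column rescaling in the style of AOL (assumed form): the result has
    spectral norm at most 1, so no Frobenius normalization is needed."""
    gram = W.T @ W
    d = 1.0 / np.sqrt(np.abs(gram).sum(axis=0) + eps)
    return W * d  # scales column j by d[j]

def newton_schulz(X, coeffs):
    """Quintic Newton-Schulz with one (a, b, c) triple per iteration:
    X <- a X + b X (X^T X) + c X (X^T X)^2."""
    eye = np.eye(X.shape[1])
    for a, b, c in coeffs:
        A = X.T @ X
        X = X @ (a * eye + b * A + c * (A @ A))
    return X

def polar_error(Q, X):
    """Frobenius distance between Q and the polar factor U V^T of X."""
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    return np.linalg.norm(Q - U @ Vt)

# Coefficient choices (assumptions, not tuned for AOL preconditioning).
CLASSIC = (15 / 8, -10 / 8, 3 / 8)
MUON = (3.4445, -4.7750, 2.0315)

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 4))
X = aol_rescale(W)

budget = 4  # fixed iteration budget
for name, triple in [("classic", CLASSIC), ("muon", MUON)]:
    Q = newton_schulz(X, [triple] * budget)
    print(name, polar_error(Q, X))
```

Because `coeffs` is a list with one triple per step, iteration-dependent schedules of the kind the problem asks for can be passed in directly and compared against a constant schedule at the same budget.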
References
Therefore, we leave the computation of optimal coefficients for Turbo-Muon as potential future work.
— Turbo-Muon: Accelerating Orthogonality-Based Optimization with Pre-Conditioning
(2512.04632 - Boissin et al., 4 Dec 2025) in Appendix, Section “About the tuning of Newton-Schulz coefficients”, subsection “Ablation”