Efficient optimization of asymptotically optimal codes for Transformers
Develop efficient optimization methods for asymptotically optimal description length objectives for Transformer encoders. This covers both the asymptotically optimal two-part codes and the adaptive variational codes parameterized by Gaussian mixture priors. The goal is for these objectives to be minimized effectively in practice under finite computational resources.
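To make the kind of objective concrete, here is a minimal sketch of a generic variational description length with a Gaussian mixture prior. It is an illustration under stated assumptions, not the construction from the cited paper. It assumes a factorized Gaussian posterior over scalar weights, a shared one-dimensional mixture prior, and the standard variational code length of expected data cost plus KL(q || p). Because the KL divergence to a mixture has no closed form, the sketch substitutes a simple Monte Carlo estimate. All function and parameter names are hypothetical.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log density of a univariate Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def log_mixture_pdf(x, weights, means, stds):
    """Log density of a Gaussian mixture, using log-sum-exp for stability."""
    terms = [
        math.log(w) + log_normal_pdf(x, m, s)
        for w, m, s in zip(weights, means, stds)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def kl_to_mixture_bits(post_mean, post_std, weights, means, stds,
                       num_samples=2000, rng=None):
    """Monte Carlo estimate of KL(q || p) in bits for one scalar weight.

    q is the Gaussian posterior N(post_mean, post_std^2) and p is the
    Gaussian mixture prior. The estimate is noisy and can be slightly
    negative for small sample counts.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(post_mean, post_std)
        total += (log_normal_pdf(x, post_mean, post_std)
                  - log_mixture_pdf(x, weights, means, stds))
    return total / num_samples / math.log(2.0)  # nats -> bits


def variational_code_length_bits(data_nll_bits, post_means, post_stds,
                                 weights, means, stds, num_samples=2000):
    """Total code length: data cost plus the summed per-weight KL cost.

    data_nll_bits is assumed to be the expected negative log-likelihood of
    the data under the posterior, already expressed in bits.
    """
    rng = random.Random(0)
    model_bits = sum(
        kl_to_mixture_bits(mu, sd, weights, means, stds, num_samples, rng)
        for mu, sd in zip(post_means, post_stds)
    )
    return data_nll_bits + model_bits
```

In this sketch, a weight whose posterior sits on a narrow prior component costs few bits, while a weight far from every component is expensive. An efficient optimizer for such an objective would need gradients of both terms with respect to the posterior and prior parameters, which this plain-Python estimate does not provide.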
References
A family of codes that is asymptotically optimal represents a theoretical ideal, and while we have shown that practical instances of such codes exist, we have yet to show that they can be efficiently optimized.
— Bridging Kolmogorov Complexity and Deep Learning: Asymptotically Optimal Description Length Objectives for Transformers
(2509.22445 - Shaw et al., 26 Sep 2025) in Appendix, Section "Asymptotically Quasi-Optimal Families of Codes" (first paragraph)