Extension beyond Gaussian supervised inputs to natural self-supervised settings

Extend the high-dimensional analysis of empirical risk minimization for single-head tied attention from Gaussian input embeddings with supervised targets to natural structured inputs from which the model learns in a self-supervised manner, and determine whether the derived generalization and spectral predictions continue to hold.

Background

The current analysis assumes Gaussian inputs and a supervised, attention-indexed target, which enable tractable characterization of generalization and spectral properties.
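To make the Gaussian, supervised setting concrete, the following is a minimal sketch of how such data could be generated. It is not the paper's exact model: the function names, the rank `r`, the `1/sqrt(d)` scaling, and the choice of using the full attention matrix as the target are all illustrative assumptions.

```python
import numpy as np

def row_softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def tied_attention(X, W):
    """Attention matrices softmax(X W W^T X^T / d), with query = key = W (tied)."""
    d = X.shape[-1]
    H = X @ W / np.sqrt(d)                           # (n, T, r) projected tokens
    return row_softmax(H @ np.swapaxes(H, -1, -2))   # (n, T, T)

def sample_gaussian_task(n, T, d, r, seed=0):
    """Hypothetical teacher: i.i.d. Gaussian tokens, targets from a tied-attention map."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, T, d))   # Gaussian input embeddings (assumed i.i.d.)
    W_star = rng.standard_normal((d, r)) # teacher weights (assumed Gaussian, rank r)
    Y = tied_attention(X, W_star)        # supervised, attention-indexed targets
    return X, Y, W_star

X, Y, W_star = sample_gaussian_task(n=8, T=4, d=32, r=2)

# Spectrum of the d x d matrix W W^T: at most r eigenvalues are nonzero.
spectrum = np.linalg.eigvalsh(W_star @ W_star.T)
```

In a self-supervised variant, the targets `Y` would no longer come from a separate teacher but would be derived from the inputs themselves, and `X` would be drawn from natural data rather than a Gaussian; whether the spectral predictions survive that change is the open question.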

The authors explicitly state that extending their framework to natural data and self-supervised learning remains an open problem. Such an extension would test how robust their predictions are and whether they apply to realistic training regimes.

References

Second, we assume Gaussian input embeddings with structured input-output relationship under supervised learning, leaving open the extension to natural structured inputs from which one learns in self-supervised manner.

— Inductive Bias and Spectral Properties of Single-Head Attention in High Dimensions (2509.24914 - Boncoraglio et al., 29 Sep 2025) in Section 6, Conclusion and limitations
