Formalize the hypothesized LayerNorm-induced ellipsoidal deformation of activation space

Formalize the hypothesis that the affine components (the learned scale and shift) of the last LayerNorm in DINOv2 deform the activation space into an ellipsoidal geometry that locally increases curvature along some directions, so that dictionary learning requires more atoms to cover the manifold with low reconstruction error.

Background

While analyzing the dictionary’s singular-value spectrum and anisotropy, the authors speculate that affine components of the final LayerNorm may deform the activation space in a way that influences dictionary learning, potentially explaining the observed need for more atoms along certain directions.

This is presented as a speculative explanation for anisotropy and sharp spectral decay; the authors explicitly note that this account has not yet been formalized.
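
As a starting point for a formalization, the "ellipsoidal" part of the hypothesis can be checked numerically. The sketch below is not from the paper. It assumes a standard LayerNorm with epsilon set to 0 and uses made-up values for the scale `gamma` and shift `beta` (hypothetical, not DINOv2's weights). Under those assumptions, the normalized activations lie on a sphere of radius sqrt(d) inside the zero-mean hyperplane, and the affine step maps that sphere onto an ellipsoid centered at `beta` with semi-axes sqrt(d) * |gamma_i|.

```python
import numpy as np

def layernorm_parts(x, gamma, beta, eps=0.0):
    """Return (x_hat, y): the normalized activations and the affine output."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)  # population variance, as in LayerNorm
    x_hat = (x - mu) / np.sqrt(var + eps)
    return x_hat, gamma * x_hat + beta

rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=(1000, d))       # stand-in activations (assumption)
gamma = np.linspace(0.5, 3.0, d)     # hypothetical anisotropic scale
beta = rng.normal(size=d)            # hypothetical shift

x_hat, y = layernorm_parts(x, gamma, beta)

# Before the affine step: every point satisfies sum(x_hat**2) == d.
sphere_radius_sq = (x_hat ** 2).sum(axis=-1)

# After the affine step: every point satisfies sum(((y - beta) / gamma)**2) == d,
# which is the equation of an axis-aligned ellipsoid centered at beta.
ellipsoid_form = (((y - beta) / gamma) ** 2).sum(axis=-1)

# Semi-axis lengths of that ellipsoid, one per coordinate.
semi_axes = np.sqrt(d) * np.abs(gamma)
```

With a nonzero epsilon, as used in practice, both identities hold only approximately. The sketch covers only the ellipsoidal shape; the link between the resulting curvature and the number of dictionary atoms needed is the part that remains to be formalized.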

References

A speculative hypothesis is that the affine components of the last LayerNorm deforms the activation space into an ellipsoidal geometry, locally increasing curvature along some directions. In such regions, dictionary learning may require more atoms to adequately cover the manifold with low reconstruction error. While this remains to be formalized, similar intuitions appear in manifold tiling and sparse coding contexts.

— Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry (2510.08638 - Fel et al., 8 Oct 2025) in Section: Statistics and Geometry of Concepts, Geometric Organization (footnote to Global Anisotropy discussion)
