Formalize the hypothesized LayerNorm-induced ellipsoidal deformation of activation space
Formalize the hypothesis that the affine components (gain and bias) of the last LayerNorm in DINOv2 deform the activation space into an ellipsoidal geometry that locally increases curvature along some directions, so that dictionary learning needs more atoms to cover the manifold at low reconstruction error.
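One possible starting point for such a formalization is the elementary fact that LayerNorm without its affine part places every d-dimensional token on a sphere of squared radius d (inside the zero-mean hyperplane), so a per-coordinate gain gamma and bias beta map that sphere onto an axis-aligned ellipsoid centred at beta. The sketch below only checks this identity numerically. It is an illustration under stated assumptions, not the paper's method: the inputs are random, `gamma` and `beta` are hypothetical stand-ins for the learned parameters, and `eps` is set to 0 so the identity holds exactly (a real model uses a small positive `eps`, which makes it approximate).

```python
import numpy as np

def layernorm_no_affine(x, eps=0.0):
    """Normalize each row to zero mean and unit population variance."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)  # ddof=0, as in standard LayerNorm
    return (x - mu) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
d = 16                                   # hypothetical feature dimension
x = rng.normal(size=(1000, d))           # random stand-in for pre-norm activations
gamma = rng.uniform(0.5, 2.0, size=d)    # hypothetical LayerNorm gain
beta = rng.normal(size=d)                # hypothetical LayerNorm bias

y = layernorm_no_affine(x)               # normalized activations
z = gamma * y + beta                     # after the affine components

# Each row of y sums to zero and has squared norm d (a sphere in a hyperplane).
row_sums = y.sum(axis=-1)
sphere_form = (y ** 2).sum(axis=-1)

# Each row of z satisfies sum_i ((z_i - beta_i) / gamma_i)^2 = d (an ellipsoid).
ellipsoid_form = (((z - beta) / gamma) ** 2).sum(axis=-1)

# Crude anisotropy indicator: ratio of the largest to the smallest gain magnitude.
axis_ratio = np.abs(gamma).max() / np.abs(gamma).min()
```

The ratio `axis_ratio` is only a rough proxy for how far the ellipsoid departs from a sphere. Relating it to local curvature, and curvature to the number of dictionary atoms needed for a given reconstruction error, is the part that remains to be formalized.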
References
A speculative hypothesis is that the affine components of the last LayerNorm deforms the activation space into an ellipsoidal geometry, locally increasing curvature along some directions. In such regions, dictionary learning may require more atoms to adequately cover the manifold with low reconstruction error. While this remains to be formalized, similar intuitions appear in manifold tiling and sparse coding contexts.
— Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry
(2510.08638 - Fel et al., 8 Oct 2025) in Section: Statistics and Geometry of Concepts, Geometric Organization (footnote to Global Anisotropy discussion)