Generalization of candidate belief geometries across models and layers
Determine whether the candidate simplex-structured representations, and the barycentric predictive advantages associated with them, that were identified in the layer-20 residual stream of Gemma-2-9B generalize to other language model architectures and to other layers.
References
All real-model results are from Gemma-2-9B, layer 20. Whether findings generalize to other models, architectures, or layers is unknown.
— Finding Belief Geometries with Sparse Autoencoders
(2604.02685 - Levinson, 3 Apr 2026) in Subsection "Limitations" (Discussion)