Origin of hotspot-layer fractional depth differences across architectures
Determine whether the early-layer localisation of the introspection direction at approximately 6.25% of total depth in Llama 3.1 (layer 2 in the 8B model and layer 5 in the 70B model), versus 12.5% in Qwen 2.5-32B, reflects architectural differences or a genuinely different placement of the self-referential processing mechanism.
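The fractional depths quoted above can be checked with a little arithmetic. The sketch below is illustrative only: the layer counts (32, 80 and 64) are assumptions taken from the publicly released model configurations, not from the cited paper, and the helper names are hypothetical. It also assumes fractional depth is simply layer index divided by total layer count; whether the paper counts layers from 0 or 1 is not stated here.

```python
# Assumed transformer block counts from the public model configs
# (not stated in the source text).
NUM_LAYERS = {
    "Llama-3.1-8B": 32,
    "Llama-3.1-70B": 80,
    "Qwen2.5-32B": 64,
}


def fractional_depth(layer: int, num_layers: int) -> float:
    """Return the layer index as a fraction of total depth (assumed definition)."""
    return layer / num_layers


def layer_at_fraction(fraction: float, num_layers: int) -> float:
    """Return the layer index implied by a given fractional depth."""
    return fraction * num_layers


# Hotspot layers reported for Llama 3.1 in the problem statement.
llama_8b = fractional_depth(2, NUM_LAYERS["Llama-3.1-8B"])
llama_70b = fractional_depth(5, NUM_LAYERS["Llama-3.1-70B"])

# The statement gives only a fraction (12.5%) for Qwen; the implied layer
# index is derived here under the assumed layer count.
qwen_layer = layer_at_fraction(0.125, NUM_LAYERS["Qwen2.5-32B"])

print(f"Llama 3.1 8B:  {llama_8b:.4%}")
print(f"Llama 3.1 70B: {llama_70b:.4%}")
print(f"Qwen 2.5-32B implied hotspot layer: {qwen_layer:g}")
```

Under these assumptions both Llama 3.1 models land on the same fraction, and the Qwen hotspot sits at twice that fraction, which is the discrepancy the open question asks about.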
References
Whether this reflects architectural differences or a genuinely different placement remains an open question.
Source: When Models Examine Themselves: Vocabulary-Activation Correspondence in Self-Referential Processing (2602.11358, Dadfar, 11 Feb 2026), Section 6.4, Layer Localisation and the 3.0 Question