Interaction of Positional Embeddings, Local Attention, and Geometric Formation

Characterize how positional embeddings and local attention kernels (including sliding-window attention) interact with the formation of the low-dimensional geometric substrate (e.g., entropy-ordered value manifolds) in transformer language models, with particular attention to sliding-window and hybrid transformer–state-space-model architectures.

Background

Beyond identifying value manifolds and key orthogonality, the paper notes that it remains unresolved how architectural components influence the emergence of geometric structure. In particular, positional embeddings and local attention kernels likely interact with geometric formation, and this interaction appears to affect dynamic attention focusing in architectures with constrained routing, such as sliding-window attention (sketched below).
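
To make the constrained-routing setting concrete, here is a minimal sketch combining a sliding-window (banded causal) mask with rotary positional embeddings in a single attention head. The window size, dimensions, and random inputs are illustrative assumptions, not details from the paper.

```python
# Minimal single-head sketch: sliding-window mask + rotary positional
# embeddings (RoPE). All sizes and inputs are illustrative assumptions.
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary positional embeddings to x of shape (seq, dim)."""
    seq, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)          # per-pair frequencies
    angles = np.arange(seq)[:, None] * freqs[None, :]  # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

def sliding_window_attention(q, k, v, window=4):
    """Causal attention where each query sees at most `window` recent keys."""
    seq, dim = q.shape
    scores = (rope(q) @ rope(k).T) / np.sqrt(dim)
    i, j = np.indices((seq, seq))
    mask = (j <= i) & (j > i - window)        # causal band of width `window`
    scores = np.where(mask, scores, -np.inf)  # routing outside the band is cut
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
seq, dim = 16, 8
q, k, v = (rng.standard_normal((seq, dim)) for _ in range(3))
out, attn = sliding_window_attention(q, k, v, window=4)
print(attn[8].round(3))  # nonzero mass only on positions 5..8
```

The mask makes the routing constraint explicit: whatever geometric structure the value vectors carry, a query can only aggregate it from the local band, which is the setting in which the interaction with positional embeddings is in question.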

Understanding this interaction is important for explaining architecture-dependent differences in dynamic Bayesian signatures and for designing models or diagnostics that preserve geometric clarity while meeting efficiency constraints.
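
As a hedged example of such a diagnostic, the sketch below estimates per-position attention entropy alongside the effective dimensionality (participation ratio) of the value vectors. Pairing these two statistics as a probe of entropy-ordered value manifolds is an illustrative assumption, not the paper's protocol; the stand-in data can be replaced by the `attn` and `v` arrays from the previous sketch.

```python
# Hedged diagnostic sketch: attention entropy + effective dimensionality of
# the value cloud. The pairing of these statistics is an assumption for
# illustration, not the paper's measurement protocol.
import numpy as np

def attention_entropy(weights, eps=1e-12):
    """Shannon entropy (nats) of each query's attention distribution."""
    w = np.clip(weights, eps, 1.0)
    return -(w * np.log(w)).sum(axis=-1)

def participation_ratio(values):
    """Effective dimensionality of value vectors: (sum lam)^2 / sum lam^2."""
    centered = values - values.mean(axis=0, keepdims=True)
    eigvals = np.linalg.eigvalsh(np.cov(centered.T))
    eigvals = np.clip(eigvals, 0.0, None)  # guard tiny negative eigenvalues
    return eigvals.sum() ** 2 / (eigvals ** 2).sum()

rng = np.random.default_rng(1)
weights = rng.dirichlet(np.ones(16), size=16)  # stand-in attention rows
scales = np.array([3.0, 2.0, 1.0, 0.2, 0.1, 0.05, 0.02, 0.01])
values = rng.standard_normal((64, 8)) * scales  # anisotropic value cloud

print(attention_entropy(weights).mean().round(2))       # avg focusing
print(round(float(participation_ratio(values)), 2))     # ~2: low-dim substrate
```

Under the framing above, one would compare these statistics across window sizes and positional-embedding schemes; the comparison protocol itself is assumed here, not prescribed by the paper.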

References

Likewise, the interaction between positional embeddings, local attention kernels, and geometric formation remains an open problem, especially in sliding-window or hybrid transformer–SSM architectures.

Geometric Scaling of Bayesian Inference in LLMs (2512.23752 - Aggarwal et al., 27 Dec 2025) in Analysis and Key Findings — Robustness and Limitations — Open representational questions