Geometric Scaling of Bayesian Inference in LLMs (2512.23752v1)

Published 27 Dec 2025 in cs.LG and cs.AI

Abstract: Recent work has shown that small transformers trained in controlled "wind-tunnel" settings can implement exact Bayesian inference, and that their training dynamics produce a geometric substrate -- low-dimensional value manifolds and progressively orthogonal keys -- that encodes posterior structure. We investigate whether this geometric signature persists in production-grade LLMs. Across Pythia, Phi-2, Llama-3, and Mistral families, we find that last-layer value representations organize along a single dominant axis whose position strongly correlates with predictive entropy, and that domain-restricted prompts collapse this structure into the same low-dimensional manifolds observed in synthetic settings. To probe the role of this geometry, we perform targeted interventions on the entropy-aligned axis of Pythia-410M during in-context learning. Removing or perturbing this axis selectively disrupts the local uncertainty geometry, whereas matched random-axis interventions leave it intact. However, these single-layer manipulations do not produce proportionally specific degradation in Bayesian-like behavior, indicating that the geometry is a privileged readout of uncertainty rather than a singular computational bottleneck. Taken together, our results show that modern LLMs preserve the geometric substrate that enables Bayesian inference in wind tunnels, and organize their approximate Bayesian updates along this substrate.

Summary

  • The paper demonstrates that LLMs internalize low-dimensional value manifolds whose alignment with predictive entropy reflects Bayesian uncertainty.
  • The paper employs PCA, key orthogonality, and attention entropy metrics across multiple models to validate the link between geometric structures and Bayesian evidence integration.
  • The paper reveals that architectural choices and training data modulate manifold collapse, offering actionable insights for improving uncertainty representation in transformers.

Geometric Scaling of Bayesian Inference in LLMs

Overview

"Geometric Scaling of Bayesian Inference in LLMs" (2512.23752) investigates whether the geometric mechanisms that enable exact Bayesian inference in synthetic "wind-tunnel" settings persist in modern LLMs. The study systematically probes value-manifold structure, key orthogonality, and attention dynamics across representative architectures (Pythia, Phi-2, Llama-3, Mistral), and links these geometric signatures to Bayesian evidence integration both in training and inference. The findings directly address foundational questions in interpretability and uncertainty representation in transformer-based LLMs.

Background: Bayesian Geometry in Transformers

Prior work established that small transformers, when trained on synthetic tasks with analytically tractable posteriors (e.g., bijection learning, HMM inference), form highly structured geometric substrates. Three principal signatures emerge:

  • Value manifolds: The last-layer value vectors align along low-dimensional trajectories parameterized by posterior entropy.
  • Key orthogonality: Key matrices develop near-orthogonal columns representing hypothesis frames.
  • Attention distributions: Attention aligns with posterior predictive distributions, serving as a geometric Bayes rule.

This geometry emerges from cross-entropy gradients: value manifolds and key frames support inference, while attention focusing sharpens posterior precision. The core question posed is whether these same signatures arise naturally in production-scale LLMs, trained on heterogeneous data and subject to architectural optimizations (GQA, RoPE, sliding-window attention, MoE).
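To make the "geometric Bayes rule" reading concrete, one schematic correspondence (an illustrative gloss under the wind-tunnel assumptions, not the paper's exact derivation) is that Bayes' rule over hypotheses $h_j$ and the attention softmax share the same normalized-exponential form:

$$
p(h_j \mid x_{1:t}) \;=\; \frac{p(h_j)\,p(x_{1:t}\mid h_j)}{\sum_i p(h_i)\,p(x_{1:t}\mid h_i)},
\qquad
a_j \;=\; \frac{\exp\!\left(q^\top k_j/\sqrt{d}\right)}{\sum_i \exp\!\left(q^\top k_i/\sqrt{d}\right)}.
$$

If near-orthogonal keys $k_j$ act as hypothesis frames and the query $q$ accumulates log prior-plus-likelihood along those directions, the two normalizations coincide and the attention weights $a_j$ track the posterior.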

Experimental Design

The authors analyze four representative LLM families:

  • Pythia (standard MHA, general-purpose training)
  • Phi-2 (standard MHA, curated/high-quality training)
  • Llama-3.2-1B (grouped-query attention, web-trained)
  • Mistral family (GQA, sliding-window, and MoE)

The geometric substrate is interrogated via the following probes (a minimal code sketch of the first one follows the list):

  • PCA of last-layer value representations, under diverse ("mixed-domain") and restricted prompts (e.g., mathematics only)
  • Quantification of key orthogonality against random and initialization baselines
  • Layerwise attention entropy as a focusing metric
  • Direct in-context learning (ICL) experiments with analytically known Bayesian posteriors (SULA task)
  • Causal interventions ablating entropy-aligned manifold directions
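
A minimal sketch of the PCA/entropy probe from the first bullet, assuming a Hugging Face-style causal LM and using final-token last-layer hidden states as a stand-in for the paper's value vectors; the model name, toy prompts, and that stand-in choice are all illustrative assumptions, not the paper's exact pipeline.

```python
# Sketch of the PCA / entropy-correlation probe. Illustrative only: it uses
# final-token last-layer hidden states as a stand-in for value vectors and a
# tiny toy prompt set; a real run needs many prompts per domain.
import numpy as np
import torch
from scipy.stats import spearmanr
from sklearn.decomposition import PCA
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-410m"   # one of the studied families
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompts = ["2 + 2 =", "17 * 3 =", "The derivative of x^2 is",
           "The capital of France is", "Once upon a time"]   # toy mixed-domain set

reps, entropies = [], []
with torch.no_grad():
    for p in prompts:
        ids = tok(p, return_tensors="pt")
        out = model(**ids, output_hidden_states=True)
        reps.append(out.hidden_states[-1][0, -1].numpy())       # last layer, last token
        probs = torch.softmax(out.logits[0, -1], dim=-1).clamp_min(1e-12)
        entropies.append(-(probs * probs.log2()).sum().item())  # predictive entropy, bits

X = np.stack(reps)
pca = PCA(n_components=2).fit(X)
print("variance explained by PC1+PC2:", pca.explained_variance_ratio_.sum())

rho, _ = spearmanr(pca.transform(X)[:, 0], entropies)
print("Spearman |rho|, PC1 position vs. predictive entropy:", abs(rho))
```

Repeating the same measurement with a domain-restricted prompt set (e.g., mathematics only) gives the domain-restriction comparison described below.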

Static and Dynamic Geometric Signatures

Domain Restriction and Value Manifold Collapse

A central wind-tunnel prediction is validated: domain restriction should collapse the value manifold toward one dimension, corresponding to posterior entropy. In Llama-3.2-1B, mathematics-only prompts increase the variance explained by the top two PCs from 51.4% to 73.6%, approaching the one-dimensional structure seen in synthetic tasks. Pythia-410M, however, exhibits near-complete manifold collapse (~99.7% explained by PC1+PC2) even under mixed prompts, reflecting architectural or training-induced rigidity.

Figure 1: Domain restriction induces manifold collapse, especially in Llama-3.2-1B, as evidenced by PCA projections of last-layer value vectors with entropy coloring.

Value Manifolds, Entropy Alignment, and Posterior Evidence

Across models, final-layer value vectors align along axes that strongly correlate with predictive entropy (Spearman |ρ| = 0.14–0.59). During ICL experiments (SULA), model entropy closely tracks analytic Bayesian entropy (MAE 0.31–0.44 bits), and movement along the manifold is monotonic in supplied evidence. Controls demonstrate that this trajectory specifically reflects likelihood integration, not superficial prompt structure.
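
A hedged illustration of the entropy-tracking comparison: the Beta-Bernoulli update below is a toy stand-in, not the paper's SULA task, and the "model" entropies are simulated; it only shows how a model's predictive entropy can be scored against an analytic Bayesian entropy via MAE in bits.

```python
# Toy comparison of analytic Bayesian entropy vs. a model-style predictive
# entropy as evidence accumulates. The Beta-Bernoulli setup and the noisy
# "model" entropies are illustrative; the paper uses its own SULA ICL task.
import numpy as np

def bernoulli_entropy_bits(p):
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

rng = np.random.default_rng(0)
true_p = 0.8
observations = rng.random(20) < true_p          # simulated evidence stream

alpha, beta = 1.0, 1.0                          # uniform Beta prior
analytic_bits, model_bits = [], []
for obs in observations:
    alpha += obs
    beta += 1 - obs
    post_mean = alpha / (alpha + beta)          # posterior predictive P(next = 1)
    analytic_bits.append(bernoulli_entropy_bits(post_mean))
    # Stand-in for the LLM's predictive entropy at this context length; in the
    # real experiment this comes from the model's next-token distribution.
    noisy_p = np.clip(post_mean + rng.normal(0, 0.05), 0, 1)
    model_bits.append(bernoulli_entropy_bits(noisy_p))

mae = np.mean(np.abs(np.array(model_bits) - np.array(analytic_bits)))
print(f"MAE between model-style and analytic entropy: {mae:.3f} bits")
```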

Orthogonality and Frame Formation

Key projection matrices in all models show 2–10× enhanced orthogonality relative to random baselines, especially in early/mid layers (mean off-diagonal cosine 0.034–0.18). This structured frame supports clean hypothesis discrimination, as predicted by cross-entropy gradient analysis.
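
A minimal sketch of the orthogonality metric (mean absolute off-diagonal cosine among key-projection rows, compared against a shape-matched random baseline); which axis encodes the key directions and where the key weights live in a given checkpoint are architecture-dependent assumptions.

```python
# Mean off-diagonal |cosine| of a key projection matrix vs. a random baseline.
# Replace W_key with an actual weight (e.g., a model's k_proj or the key slice
# of query_key_value) for a real measurement; here it is a random stand-in.
import numpy as np

def mean_offdiag_abs_cosine(W):
    """Mean |cosine similarity| between distinct rows of W."""
    Wn = W / np.linalg.norm(W, axis=1, keepdims=True)
    C = Wn @ Wn.T
    off = C[~np.eye(C.shape[0], dtype=bool)]
    return np.abs(off).mean()

rng = np.random.default_rng(0)
d_model, d_head = 1024, 64

W_key = rng.normal(size=(d_head, d_model))    # stand-in for learned key weights
W_rand = rng.normal(size=(d_head, d_model))   # random baseline, matched shape

print("learned (stand-in):", mean_offdiag_abs_cosine(W_key))
print("random baseline   :", mean_offdiag_abs_cosine(W_rand))
```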

Attention Focusing: Architectural Dependence

Layerwise attention entropy decreases most strongly in standard MHA (up to 86% in Phi-2), modestly in GQA (31% in Llama-3.2-1B), and is notably attenuated or non-monotonic in Mistral models (sliding-window/MoE). Thus, dynamic focusing is modulated by routing capacity, but static geometric invariants persist.
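
The focusing metric can be sketched as the average entropy of each layer's attention distributions; the model choice, single prompt, and eager-attention setting below are illustrative assumptions, not the paper's evaluation protocol.

```python
# Layerwise attention entropy for a single prompt. Focusing corresponds to
# entropy decreasing with depth; model and prompt here are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-410m"
tok = AutoTokenizer.from_pretrained(model_name)
# Eager attention so the per-head attention weights are returned.
model = AutoModelForCausalLM.from_pretrained(model_name, attn_implementation="eager")
model.eval()

ids = tok("The prime factorization of 84 is", return_tensors="pt")
with torch.no_grad():
    out = model(**ids, output_attentions=True)

for layer, attn in enumerate(out.attentions):   # shape: (batch, heads, query, key)
    probs = attn.clamp_min(1e-12)
    ent = -(probs * probs.log2()).sum(dim=-1)   # entropy of each query's distribution
    print(f"layer {layer:2d}: mean attention entropy = {ent.mean().item():.3f} bits")
```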

Cross-Model Geometric Summary

Figure 3: Normalized comparison of geometric metrics (manifold collapse, domain-restriction gain, key orthogonality, attention focusing) across model families highlights static invariants and dynamic variability.

Causal Probes: Entropy-Axis Ablations

Entropy-aligned manifold directions are ablated in Pythia-410M to test computational necessity. While projection removal destroys the geometry-entropy correlation, Bayesian calibration is minimally affected. This result indicates that the value manifold serves as a privileged readout of uncertainty, not a computational bottleneck; inference representations are distributed across layers and dimensions.
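
A hedged sketch of a projection-removal intervention: in the paper the ablated direction would be the entropy-aligned axis identified earlier, whereas the random unit vector, the hook on the final layer's output (rather than the value stream specifically), and the single prompt below are simplifying, illustrative assumptions.

```python
# Projection-removal intervention: ablate one direction u from a layer's output
# and compare predictive entropy before/after. Hook location and the random u
# are stand-ins; replace u with the entropy-aligned PC1 for a real probe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-410m"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

u = torch.randn(model.config.hidden_size)
u = u / u.norm()                                # unit vector to ablate

def ablate_direction(module, inputs, output):
    # GPT-NeoX layers may return a tuple whose first element is the hidden state.
    h = output[0] if isinstance(output, tuple) else output
    h = h - (h @ u).unsqueeze(-1) * u           # remove the component along u
    return (h,) + output[1:] if isinstance(output, tuple) else h

def predictive_entropy_bits(logits):
    p = torch.softmax(logits[0, -1], dim=-1).clamp_min(1e-12)
    return -(p * p.log2()).sum().item()

ids = tok("After observing three heads in a row, the next flip is", return_tensors="pt")
with torch.no_grad():
    base = predictive_entropy_bits(model(**ids).logits)

handle = model.gpt_neox.layers[-1].register_forward_hook(ablate_direction)
with torch.no_grad():
    ablated = predictive_entropy_bits(model(**ids).logits)
handle.remove()

print(f"entropy before: {base:.3f} bits, after ablation: {ablated:.3f} bits")
```

A matched control repeats the same removal along a random direction, which is the comparison the paper uses to argue the entropy axis is a privileged readout rather than a bottleneck.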

Depth, Training Data, and Representational Richness

Deeper or highly curated models (e.g., Phi-2) exhibit richer and clearer geometric signatures (2D manifolds, sharper orthogonality, stronger focusing). Mixed-domain prompts in deeper models can produce multi-lobed or higher-dimensional value structures, which collapse under domain restriction. Training on curated data enhances geometric clarity and likely supports more robust uncertainty modeling.

Architectural Trade-offs and Scaling Implications

Grouped-query attention architectures (e.g., Llama-3.2-1B) achieve efficiency gains but at the expense of diminished orthogonality and focusing clarity. Mistral-family models further attenuate focusing via sliding-window and MoE routing but retain static manifolds and hypothesis frames. This demonstrates a separation between universal representational substrates and mechanism-specific dynamics, matching the frame–precision dissociation predicted by gradient analyses.

Robustness, Limitations, and Open Directions

Bayesian geometric invariants (value manifolds, hypothesis frames) are robust across architecture, depth, and training regime, while layerwise focusing is sensitive to routing capacity. The emergence of multidimensional manifolds in large/deep models, the causes of architecture-dependent variability, and the precise triggering conditions for manifold collapse require further analysis—particularly at frontier model scales and in more diverse architectural regimes.

Implications and Future Research

This study provides direct evidence that transformers internalize a geometric coordinate system for representing uncertainty, and that evidence integration during inference leverages this substrate. Manifold coordinates offer a scalable, architecture-agnostic axis for uncertainty tracking and may support new diagnostics for interpretability, safety, and reliability estimation. Establishing causal roles and developing interventional methods remain open avenues. The findings motivate architectural choices (favoring standard MHA for interpretability) and curriculum learning strategies (curated-to-diverse data for geometric clarity).

Conclusion

Modern LLMs, regardless of scale or architectural optimization, develop low-dimensional geometric manifolds aligned with Bayesian uncertainty representation and update these representations during inference. However, the geometric substrate functions as a stable representational readout, not a singular computational pathway. The study opens a lens for principled mechanistic interpretability grounded in geometric invariants, with significant implications for architecture design, model analysis, and uncertainty quantification in next-generation AI.
