Representational Curvature Modulates Behavioral Uncertainty in Large Language Models

Published 27 Apr 2026 in cs.AI, cs.CL, and cs.LG | (2604.23985v1)

Abstract: In autoregressive LLMs, temporal straightening offers an account of how the next-token prediction objective shapes representations. Models learn to progressively straighten the representational trajectory of input sequences across layers, potentially facilitating next-token prediction via linear extrapolation. However, a direct link between this trajectory and token-level behavior has been missing. We provide such a link by relating contextual curvature, a geometric measure of how sharply the representational trajectory bends over recent context, to next-token entropy. Across two models (GPT-2 XL and Pythia-2.8B), contextual curvature is correlated with entropy, and this relationship emerges during training. Perturbation experiments reveal selective dependence: manipulating curvature through trajectory-aligned interventions reliably modulates entropy, while geometrically misaligned perturbations have no effect. Finally, regularizing representations to be straighter during training modestly reduces token-level entropy without degrading validation loss. These results identify trajectory curvature as a task-aligned representational feature that influences behavioral uncertainty in LLMs.


Authors (3)

Summary

The paper demonstrates that trajectory curvature in LLMs modulates next-token entropy, with strongest effects observed in middle transformer layers.
It employs geometric measures and targeted perturbation experiments to show that trajectory-aligned interventions selectively control model uncertainty.
Curvature regularization during training shifts token-level entropy without a significant increase in validation loss, suggesting a possible route to uncertainty calibration.

Representational Curvature as a Modulator of Uncertainty in LLMs

Introduction

This paper addresses the functional impact of internal representational geometry—specifically, trajectory curvature—on behavioral uncertainty in autoregressive LLMs. Motivated by principles from computational neuroscience and mechanistic interpretability, the authors hypothesize that temporal straightening in hidden-state trajectories serves to align representations for effective next-token prediction. Adopting scalar geometric measures, they explicitly link the curvature of token-wise trajectories in the residual stream to next-token output entropy, quantifying the relationship at layer- and training-dynamics resolution. The analysis spans multiple open-weight models and controlled training experiments, incorporating both perturbative and regularization-based interventions.

Contextual Curvature and Entropy: Empirical Correlation

The central result is that contextual curvature, defined as the mean angular deviation between successive difference vectors in the token activation manifold, predicts next-token entropy most strongly in the middle layers of deep transformer stacks. Both GPT-2 XL and Pythia-2.8B exhibit a monotonic reduction in curvature from early to middle layers, with the minimum coinciding with the layers of maximal correlation ($r \approx 0.15$) between curvature and entropy.

Figure 1: Average contextual curvature and its predictive correlation with next-token entropy across transformer layers; minima and maxima co-localize in middle layers for both GPT-2 XL and Pythia-2.8B.
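The definition of contextual curvature given above can be illustrated with a minimal sketch. It assumes hidden states are supplied as plain lists of floats, one per token, and measures the angle between successive step vectors in radians. The function name `contextual_curvature` and the `window` parameter are hypothetical, and the paper's exact windowing and normalization may differ.

```python
import math

def contextual_curvature(states, window=None):
    """Mean angle (radians) between consecutive step vectors of a trajectory.

    states: list of equal-length float lists, one hidden state per token.
    window: optional number of most recent states to use (hypothetical knob).
    """
    if window is not None:
        states = states[-window:]
    # Difference vectors between successive hidden states.
    steps = [[b - a for a, b in zip(s0, s1)]
             for s0, s1 in zip(states, states[1:])]
    angles = []
    for u, v in zip(steps, steps[1:]):
        norm_u = math.sqrt(sum(x * x for x in u))
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_u == 0.0 or norm_v == 0.0:
            continue  # a zero-length step has no direction; skip it
        cos = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
        # Clamp to [-1, 1] to guard against floating-point rounding.
        angles.append(math.acos(max(-1.0, min(1.0, cos))))
    if not angles:
        raise ValueError("need at least three states with nonzero steps")
    return sum(angles) / len(angles)
```

Under this sketch a perfectly straight trajectory has curvature 0 and a trajectory of right-angle turns has curvature $\pi/2$. A practical implementation would likely vectorize this over layers and positions with an array library.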

Alternative scalar geometric features, including activation magnitude and trajectory distance, are less predictive and exhibit weaker layer localization. Curvature's explanatory power remains significant when controlling for unigram probability, which argues against confounding by token-frequency priors. A window-size sweep confirms robustness across definitions of local context, while layer-specific analysis shows that curvature-entropy coupling is highly localized to the representational straightening regime.
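One common way to control for a covariate such as unigram probability is a partial correlation: regress both variables on the covariate and correlate the residuals. The sketch below assumes a single scalar covariate per token (for instance the unigram log-probability) and is not necessarily the statistical procedure the authors used. All function names are hypothetical.

```python
import math

def _residuals(y, z):
    """Residuals of y after least-squares regression on one covariate z."""
    mean_y = sum(y) / len(y)
    mean_z = sum(z) / len(z)
    var_z = sum((zi - mean_z) ** 2 for zi in z)
    if var_z == 0.0:
        raise ValueError("covariate has zero variance")
    slope = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y)) / var_z
    return [(yi - mean_y) - slope * (zi - mean_z) for yi, zi in zip(y, z)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def partial_corr(curvature, entropy, unigram_logprob):
    """Correlation of curvature and entropy after removing the linear
    effect of the covariate from both."""
    return pearson(_residuals(curvature, unigram_logprob),
                   _residuals(entropy, unigram_logprob))
```

If the curvature-entropy relationship were driven entirely by token frequency, the partial correlation would fall toward zero even when the raw correlation is positive.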

Training Dynamics and the Emergence of Curvature-Entropy Coupling

Analysis of Pythia-2.8B across checkpoints reveals that both trajectory straightening and the associated increase in curvature-entropy predictivity are training-dependent phenomena. Early-stage checkpoints show uniformly high curvature and low coupling to entropy; middle-layer straightening and a rising correlation emerge at approximately 0.7% of the full training corpus.

Figure 2: Training checkpoint analysis, showing progressive straightening of trajectories and concurrent emergence of curvature-entropy coupling; predictivity peaks align with minimal curvature in final checkpoints.

Notably, the predictive power of curvature for entropy increases rapidly as the representational geometry is reshaped, consistent with the view that internal straightening is functionally aligned with the demands of next-token prediction.
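The coupling tracked across checkpoints can be sketched as a per-checkpoint Pearson correlation between per-token curvature and next-token entropy. The code below assumes entropy is the Shannon entropy (in nats) of the softmax over output logits and that curvature values have already been computed. The `records` layout and the function names are hypothetical.

```python
import math

def next_token_entropy(logits):
    """Shannon entropy in nats of softmax(logits), computed stably."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def coupling_by_checkpoint(records):
    """Map each checkpoint name to its curvature-entropy correlation.

    records: dict of name -> (curvatures, entropies), one value per token.
    """
    return {name: pearson_r(curvatures, entropies)
            for name, (curvatures, entropies) in records.items()}
```

Plotting the returned values against training progress, one curve per layer, would give a picture comparable in spirit to the checkpoint analysis described here.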

Perturbation Experiments: Trajectory-Aligned Modulation of Output Uncertainty

Perturbation experiments test whether curvature is selectively relevant to behavior. Additive perturbations to the residual stream, scaled relative to the intrinsic trajectory step size, are drawn from five geometrically distinct families, ranging from full-space perturbations to planar subspaces tightly aligned with the representational trajectory. Only trajectory-aligned perturbations (trajectory-subspace and planar-subspace) reliably induce directional changes in output entropy, producing a robust correlation between $\Delta C$ (curvature change) and $\Delta H$ (entropy change).

Figure 3: Perturbation schematic and $\Delta C$ – $\Delta H$ correlation across subspaces; trajectory-aligned and planar-subspace perturbations exhibit selective coupling, while agnostic directions show negligible behavioral impact.
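A trajectory-subspace perturbation of the kind described above might be constructed as follows. This is a sketch under stated assumptions: the subspace is taken to be the span of recent step vectors, the direction is random Gaussian noise projected onto that span, and the magnitude is `alpha` times the mean step norm. The function names, `alpha`, and the sampling scheme are hypothetical and are not taken from the paper.

```python
import math
import random

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt basis for the span of the given vectors."""
    basis = []
    for vec in vectors:
        w = list(vec)
        for b in basis:
            coeff = _dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(_dot(w, w))
        if norm > tol:  # drop directions already covered by the basis
            basis.append([wi / norm for wi in w])
    return basis

def trajectory_aligned_perturbation(states, alpha=0.5, rng=None):
    """Random vector inside the span of recent steps, with norm equal to
    alpha times the mean step norm. Add it to a hidden state to perturb."""
    rng = rng or random.Random(0)
    steps = [[b - a for a, b in zip(s0, s1)]
             for s0, s1 in zip(states, states[1:])]
    basis = orthonormal_basis(steps)
    if not basis:
        raise ValueError("trajectory has no nonzero steps")
    dim = len(states[0])
    noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Project the noise onto the trajectory subspace.
    proj = [0.0] * dim
    for b in basis:
        coeff = _dot(noise, b)
        proj = [p + coeff * bi for p, bi in zip(proj, b)]
    proj_norm = math.sqrt(_dot(proj, proj))
    if proj_norm == 0.0:
        raise ValueError("projected noise vanished; resample")
    mean_step = sum(math.sqrt(_dot(s, s)) for s in steps) / len(steps)
    scale = alpha * mean_step / proj_norm
    return [scale * p for p in proj]
```

A geometrically misaligned control could be built the same way by removing the projection onto the trajectory subspace from the noise instead of keeping it, while matching the norm.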

This result further substantiates the hypothesis that LLMs' internal sequence geometry is not just a byproduct but a functional substrate for uncertainty modulation.

Curvature Regularization: Direct Behavioral Control Without Loss Degradation

Curvature regularization is implemented in GPT-2 Small as an auxiliary loss penalizing mean trajectory curvature, applied during pretraining at the layers with maximal curvature-entropy coupling. The untangled (straighter) variant reduces token-level entropy, while the tangled (higher-curvature) variant increases it, in both cases without a significant increase in validation loss relative to the baseline. These effects generalize across multiple datasets.

Figure 4: Training and validation outcomes for curvature-regularized models; untangled and tangled variants respectively reduce and increase token-level entropy, with similar convergence and loss profiles.
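The shape of such an auxiliary objective can be sketched as the language-modeling loss plus a signed, weighted mean-curvature term. The sign convention (positive for the untangled variant, negative for the tangled one), the weight `lam`, and all names below are assumptions made for illustration, since the summary does not give the exact form. Real training would compute this in an autodiff framework so that gradients reach the hidden states.

```python
import math

def mean_curvature(states):
    """Mean angle (radians) between consecutive step vectors."""
    steps = [[b - a for a, b in zip(s0, s1)]
             for s0, s1 in zip(states, states[1:])]
    angles = []
    for u, v in zip(steps, steps[1:]):
        norm_u = math.sqrt(sum(x * x for x in u))
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_u == 0.0 or norm_v == 0.0:
            continue  # undefined direction; skip
        cos = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
        angles.append(math.acos(max(-1.0, min(1.0, cos))))
    if not angles:
        raise ValueError("need at least three states with nonzero steps")
    return sum(angles) / len(angles)

def regularized_loss(lm_loss, batch_states, lam=0.1, mode="untangled"):
    """Language-modeling loss plus a signed curvature term.

    batch_states: list of trajectories taken at the regularized layer.
    mode: "untangled" penalizes curvature, "tangled" rewards it,
          "baseline" leaves the loss unchanged (assumed convention).
    """
    signs = {"untangled": 1.0, "tangled": -1.0, "baseline": 0.0}
    if mode not in signs:
        raise ValueError(f"unknown mode: {mode}")
    curvature = sum(mean_curvature(s) for s in batch_states) / len(batch_states)
    return lm_loss + signs[mode] * lam * curvature
```

Keeping `lam` small relative to the language-modeling loss is one plausible way to shift curvature while leaving validation loss largely unchanged, which is the outcome reported here.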

This demonstrates that geometric regularization can directly shift behavioral uncertainty without degrading predictive performance on the next-token task.

Theoretical and Practical Implications

The authors integrate their findings with temporal straightening and geodesic hypotheses from computational neuroscience and previous mechanistic interpretability literature. Representational straightening in intermediate layers facilitates linear extrapolation and reduces uncertainty via output entropy compression. The functional localization to middle layers dovetails with other structural investigations (e.g., monosemanticity, belief dynamics, feature polysemanticity), suggesting that these layers anchor both interpretability and behavioral control. Perturbation and regularization analyses provide a toolkit for steering uncertainty, complementing activation engineering and sparse autoencoder-based control techniques.

The implications for future work are multifold: geometric regularization, auxiliary losses, and trajectory-based interpretability open doors for uncertainty calibration, robust prediction, and explicit behavioral steering in large-scale LMs. Extension to foundation models of greater scale, other architectures, and multi-modal inputs remains open; quantifying semantic structure beyond output entropy is a key challenge.

Conclusion

This paper rigorously traces the relationship between trajectory curvature in the residual stream and token-level uncertainty in autoregressive LLMs. Curvature emerges as a behaviorally relevant geometric feature during training, is selectively coupled to uncertainty via trajectory-aligned interventions, and is amenable to direct behavioral control through geometric regularization. These results establish contextual curvature as a functional bridge between internal representational geometry and external predictive uncertainty, framing it as a principled axis for both interpretation and intervention in LLMs.
