Mechanism behind context-induced reasoning shift in LLMs

Investigate the mechanism by which different prompt context conditions—specifically long irrelevant prefixes (Long input setup), multiple independent problems within a single prompt (Subtask setup), and multi-turn chat histories (Multi-turn setup)—cause large language models operating in thinking mode to generate shorter Chain-of-Thought traces and to reduce self-verification and uncertainty-management behaviors compared to solving the same problems in isolation (Baseline setup).
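A minimal sketch of how the four prompt conditions could be constructed for a chat-style reasoning model is given below. The helper names, message structure, and filler text are illustrative assumptions for exposition, not the paper's actual evaluation harness.

```python
# Illustrative sketch (not the paper's harness) of the four prompt conditions.
# All helper names and the message format are assumptions for exposition.

def baseline(problem: str) -> list[dict]:
    # Baseline setup: the problem is presented in isolation.
    return [{"role": "user", "content": problem}]

def long_input(problem: str, irrelevant_prefix: str) -> list[dict]:
    # Long input setup: a long irrelevant prefix precedes the problem.
    return [{"role": "user", "content": irrelevant_prefix + "\n\n" + problem}]

def subtask(problems: list[str]) -> list[dict]:
    # Subtask setup: multiple independent problems within a single prompt.
    joined = "\n\n".join(f"Problem {i + 1}: {p}" for i, p in enumerate(problems))
    return [{"role": "user", "content": joined}]

def multi_turn(history: list[tuple[str, str]], problem: str) -> list[dict]:
    # Multi-turn setup: prior (question, answer) turns form the chat history.
    messages = []
    for question, answer in history:
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": problem})
    return messages
```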

Background

The paper reports a robust phenomenon across multiple reasoning LLMs: when the same problems are presented under non-baseline context conditions (e.g., long irrelevant text, multiple subtasks, or multi-turn histories), models produce significantly shorter reasoning traces, in some cases up to 50% fewer tokens, and exhibit fewer self-verification and uncertainty-management behaviors.
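A sketch of the kind of measurement behind the reported shift is shown below, assuming reasoning traces are available as plain text. The marker phrases are illustrative stand-ins, not the paper's actual behavior taxonomy.

```python
# Sketch of measuring trace length and reasoning-behavior frequency across
# conditions. Marker lists are assumed for illustration only.
import re

SELF_VERIFICATION_MARKERS = ["let me check", "let me verify", "double-check"]
UNCERTAINTY_MARKERS = ["wait", "hmm", "i'm not sure", "alternatively"]

def trace_stats(trace: str) -> dict:
    lowered = trace.lower()
    return {
        "num_tokens": len(re.findall(r"\S+", trace)),  # crude whitespace token count
        "self_verification": sum(lowered.count(m) for m in SELF_VERIFICATION_MARKERS),
        "uncertainty": sum(lowered.count(m) for m in UNCERTAINTY_MARKERS),
    }

def relative_length_drop(baseline_traces: list[str], context_traces: list[str]) -> float:
    # Fraction by which mean trace length falls under a context condition;
    # a value near 0.5 would correspond to "up to 50% fewer tokens".
    mean = lambda xs: sum(xs) / len(xs)
    base = mean([trace_stats(t)["num_tokens"] for t in baseline_traces])
    ctx = mean([trace_stats(t)["num_tokens"] for t in context_traces])
    return 1.0 - ctx / base
```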

While the empirical evidence suggests that irrelevant context suppresses high-level reasoning patterns, the causal mechanism behind this behavioral shift has not been established. The authors explicitly defer a deeper mechanistic analysis, highlighting it as an unresolved question for future research.

References

We leave a deeper analysis of the mechanism behind this shift for future work.

Reasoning Shift: How Context Silently Shortens LLM Reasoning (2604.01161 - Rodionov, 1 Apr 2026) in Section 4 (Analysis), end of section