Cause of DeepSeek‑V3’s elevated token usage in natural‑language proof planning
Determine the underlying cause of DeepSeek‑V3’s markedly higher token usage in the natural‑language proof‑planning setting than in the symbolic graph‑connectivity setting, under the paper’s evaluation conditions.
References
The reason for V3's markedly higher token usage in the natural-language setting, relative to its behavior for symbolic graphs (Figure~\ref{fig:token_usage_graph}), remains unclear.
— Rameshkumar et al., “Reasoning Models Reason Well, Until They Don’t” (arXiv:2510.22371, 25 Oct 2025), Section 4.2 (Proof Planning in Deductive Reasoning)