
DAST: Difficulty-Adaptive Slow-Thinking for Large Reasoning Models (2503.04472v2)

Published 6 Mar 2025 in cs.LG and cs.AI

Abstract: Recent advancements in slow-thinking reasoning models have shown exceptional performance in complex reasoning tasks. However, these models often exhibit overthinking (generating redundant reasoning steps for simple problems), leading to excessive computational resource usage. While current mitigation strategies uniformly reduce reasoning tokens, they risk degrading performance on challenging tasks that require extended reasoning. This paper introduces Difficulty-Adaptive Slow-Thinking (DAST), a novel framework that enables models to autonomously adjust the length of Chain-of-Thought (CoT) based on problem difficulty. We first propose a Token Length Budget (TLB) metric to quantify difficulty, then leverage budget-aware reward shaping and budget preference optimization to implement DAST. DAST penalizes overlong responses for simple tasks while incentivizing sufficient reasoning for complex problems. Experiments on diverse datasets and model scales demonstrate that DAST effectively mitigates overthinking (reducing token usage by over 30% on average) while preserving reasoning accuracy on complex problems. Our code and models are available at https://github.com/AnonymousUser0520/AnonymousRepo01.
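
The abstract names the Token Length Budget (TLB) metric and budget-aware reward shaping but does not give their formulas, so the following Python sketch is only a hypothetical illustration of the idea: easy problems (high pass rate) receive a small token budget, hard problems a large one, and rewards are docked in proportion to how far a response overshoots its budget. The function names, budget range, and penalty coefficient are assumptions, not the paper's definitions.

```python
# Illustrative sketch only: the abstract does not specify DAST's exact
# formulas, so the budget definition and reward shaping below are
# assumptions rather than the paper's actual method.

def token_length_budget(pass_rate: float,
                        l_min: int = 512,
                        l_max: int = 8192) -> float:
    """Hypothetical TLB: harder problems (lower pass rate under sampling)
    get a larger token budget; easier problems get a smaller one."""
    return l_min + (1.0 - pass_rate) * (l_max - l_min)

def budget_aware_reward(correct: bool, length: int, budget: float) -> float:
    """Hypothetical budget-aware reward shaping: correct answers within
    budget score highest; responses exceeding the budget are penalized in
    proportion to the overshoot (coefficient 0.5 is an assumption)."""
    base = 1.0 if correct else -1.0
    overshoot = max(0.0, (length - budget) / budget)  # fraction over budget
    return base - 0.5 * overshoot

# Example: an easy problem (pass rate 0.9) answered correctly but verbosely.
budget = token_length_budget(pass_rate=0.9)
print(budget)                                    # 1280.0 tokens
print(budget_aware_reward(True, 4000, budget))   # reward eroded by overshoot
```

In this toy version, a correct but drastically overlong answer to an easy problem can score worse than a concise one, which mirrors the stated behavior of penalizing overlong responses on simple tasks while leaving the large budget for hard problems untouched.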
