Response-length versus tool-call efficiency trade-off
Characterize the quantitative trade-off between response length (number of inference tokens) and tool-call efficiency (number and accuracy of tool invocations) in multi-turn agentic reasoning with external tools.
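One way to make the two sides of this trade-off concrete is to log both quantities per trajectory and aggregate them. The sketch below is a minimal illustration, not a method from the cited paper: the `Trajectory` record, its field names, and the definition of tool-call accuracy as the fraction of invocations that return a usable result are all assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trajectory:
    """Hypothetical per-episode log of a multi-turn agentic rollout."""
    response_tokens: int   # total inference tokens generated across all turns
    tool_calls: int        # number of tool invocations issued
    successful_calls: int  # invocations that returned a usable result (assumed definition)
    solved: bool           # whether the final answer was correct


def summarize(trajectories):
    """Aggregate response-length and tool-call statistics over a set of rollouts."""
    total_calls = sum(t.tool_calls for t in trajectories)
    total_success = sum(t.successful_calls for t in trajectories)
    return {
        "mean_tokens": mean(t.response_tokens for t in trajectories),
        "mean_tool_calls": mean(t.tool_calls for t in trajectories),
        # Guard against division by zero when no tools were called at all.
        "tool_call_accuracy": total_success / total_calls if total_calls else 0.0,
        "solve_rate": mean(1.0 if t.solved else 0.0 for t in trajectories),
    }
```

Comparing these summaries across policies, for instance a long-response policy against a short-response one, gives paired (length, efficiency) points from which a trade-off curve could be traced. Any numbers fed into it would have to come from real rollouts.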
References
Open puzzles are unsolved regarding the allocation of turn budgets, the trade-off between response length and tool-call efficiency, and the impact of long-CoT predispositions on multi-turn reasoning.
— Demystifying Reinforcement Learning in Agentic Reasoning
(2510.11701 - Yu et al., 13 Oct 2025) in Introduction, Reasoning Mode-wise paragraph