Response-length versus tool-call efficiency trade-off

Characterize the quantitative trade-off between response length (number of inference tokens) and tool-call efficiency (number and accuracy of tool invocations) in multi-turn agentic reasoning with external tools.
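To make the two quantities concrete, the following is a minimal sketch of how they might be measured over a batch of rollouts. It is not taken from the cited paper: the `Trajectory` record, its field names, and the choice to treat a call as "accurate" when it executes successfully are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trajectory:
    """Hypothetical per-rollout record; all field names are assumptions."""
    response_tokens: int    # tokens generated by the model across all turns
    tool_calls: int         # number of tool invocations issued
    successful_calls: int   # invocations that executed without error
    correct: bool           # whether the final answer was judged correct


def summarize(trajectories):
    """Aggregate response-length and tool-call statistics for a batch."""
    total_calls = sum(t.tool_calls for t in trajectories)
    total_tokens = sum(t.response_tokens for t in trajectories)
    return {
        "accuracy": mean(1.0 if t.correct else 0.0 for t in trajectories),
        "mean_tokens": mean(t.response_tokens for t in trajectories),
        "mean_tool_calls": mean(t.tool_calls for t in trajectories),
        # Share of invocations that succeeded; 0.0 if no tool was ever called.
        "call_success_rate": (
            sum(t.successful_calls for t in trajectories) / total_calls
            if total_calls else 0.0
        ),
        # Tokens spent per tool call; infinite if no tool was ever called.
        "tokens_per_call": total_tokens / total_calls if total_calls else float("inf"),
    }
```

Comparing these summaries across reasoning modes (for example, a short-response, few-call mode against a long-response, many-call mode) would give one rough view of the trade-off, though the right definition of tool-call accuracy is itself part of the open question.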

Background

Frequent tool calls do not always improve outcomes. The authors observe that fewer, more deliberate calls can yield better performance, but how tool-call efficiency trades off against response length remains unresolved.

Understanding this trade-off is essential for designing reasoning modes that maximize accuracy and tool efficiency without inducing inefficient long-chain reasoning.

References

Open puzzles are unsolved regarding the allocation of turn budgets, the trade-off between response length and tool-call efficiency, and the impact of long-CoT predispositions on multi-turn reasoning.

Source: Demystifying Reinforcement Learning in Agentic Reasoning (arXiv:2510.11701, Yu et al., 13 Oct 2025), Introduction, Reasoning Mode-wise paragraph
