Parameter-commitment timing for tool calls in real-time voice agents

Determine a principled policy for when a real-time voice agent should commit the parameters of an API tool call during a streaming interaction. The policy should trade off latency against the ability to incorporate mid-utterance self-corrections, so that the agent avoids stale or incorrect actions while keeping a natural conversational pace.

Background

The paper presents two hard scenarios that expose a tension between speed and correctness in streaming tool use. In a clean multi-step chain, aggressive early commitment enables faster completion, as exemplified by Gemini Live 3.1. In a scenario with a double self-correction (destination and date), the same early commitment locks in outdated parameters and prevents a correct state rollback.

From these observations, the authors identify as unresolved the architectural question of when to commit tool parameters: early for speed or late for correctness. The commitment policy determines whether a model can respond quickly while still adapting to changes the user makes mid-utterance.
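
To make the trade-off concrete, here is a minimal sketch of one possible commitment rule, a stability window: the call is committed once every required parameter is filled and none has changed for `hold_ms` milliseconds. This rule is not taken from the paper. The names (`SlotEvent`, `commit_point`, `hold_ms`) and all event timings are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SlotEvent:
    """One parameter value extracted from the streaming transcript (hypothetical)."""
    t_ms: int   # time at which the value was heard
    slot: str   # parameter name, e.g. "destination"
    value: str  # parameter value, e.g. "Paris"


def commit_point(
    events: list[SlotEvent], required: set[str], hold_ms: int
) -> Optional[tuple[int, dict[str, str]]]:
    """Return (commit time, committed parameters) under a stability window.

    The call commits once all required slots are filled and no slot has
    changed for hold_ms milliseconds. hold_ms=0 models eager commitment.
    Returns None if the required slots are never all filled.
    """
    params: dict[str, str] = {}
    ready_since: Optional[int] = None
    for ev in sorted(events, key=lambda e: e.t_ms):
        # If the window elapsed before this event arrived, the call has
        # already been issued and this event comes too late to change it.
        if ready_since is not None and ev.t_ms - ready_since >= hold_ms:
            return ready_since + hold_ms, dict(params)
        params[ev.slot] = ev.value
        if all(slot in params for slot in required):
            ready_since = ev.t_ms  # any new value restarts the window
    if ready_since is None:
        return None
    return ready_since + hold_ms, dict(params)


REQUIRED = {"destination", "date"}

# Toy utterances with made-up timings (not data from the paper).
CLEAN = [
    SlotEvent(400, "destination", "Paris"),
    SlotEvent(900, "date", "Friday"),
]
DOUBLE_CORRECTION = CLEAN + [
    SlotEvent(1500, "destination", "Rome"),  # first self-correction
    SlotEvent(2100, "date", "Saturday"),     # second self-correction
]

if __name__ == "__main__":
    for name, stream in [("clean", CLEAN), ("double correction", DOUBLE_CORRECTION)]:
        for hold_ms in (0, 800):
            print(name, hold_ms, commit_point(stream, REQUIRED, hold_ms))
```

With these toy timings, eager commitment (`hold_ms=0`) issues the call at 900 ms in both streams, so the double-correction stream is committed with the stale Paris and Friday values. An 800 ms window recovers Rome and Saturday but delays the clean case to 1700 ms and the corrected case to 2900 ms. No fixed window is obviously right for both scenarios, which is the open question the authors raise.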

References

Designing when to commit tool parameters—eagerly for speed or conservatively for correctness—remains an open challenge for real-time voice agents.

Full-Duplex-Bench-v3: Benchmarking Tool Use for Full-Duplex Voice Agents Under Real-World Disfluency (2604.04847 - Lin et al., 6 Apr 2026) in Discussion — Case Studies (end of section)