Runtime Support for Inference-Time Search in Online Agent Execution

Design agent runtime environments and orchestration mechanisms that support inference-time search techniques (such as generate-verify-search), and develop methods to use these techniques effectively in online execution settings without retraining models.

Background

The paper observes that deployed agents currently favor simple, controllable methods and low call budgets, while advanced inference-time scaling and search-based reasoning (e.g., generate-verify-search) show promise for stronger reliability and controllability.

The authors argue that realizing these techniques in production requires infrastructural support (e.g., simulators and verifiers) and explicitly state that adapting runtimes to support inference-time search and using them effectively during online execution are open research questions.
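To make the terms concrete, the following is a minimal sketch of a generate-verify-search loop running under a fixed call budget. It is an illustration only and is not taken from the paper: the function name `generate_verify_search`, its parameters (`call_budget`, `accept_score`, `seed`), and the toy `toy_generate` and `toy_verify` stand-ins are all hypothetical. In a real runtime, the generator would presumably wrap a model call and the verifier would wrap a simulator or checker of the kind the authors mention.

```python
import random
from typing import Callable, Optional, Tuple


def generate_verify_search(
    generate: Callable[[random.Random], str],
    verify: Callable[[str], float],
    call_budget: int,
    accept_score: float = 1.0,
    seed: int = 0,
) -> Tuple[Optional[str], float, int]:
    """Sample candidates until one is accepted or the budget is spent.

    Returns (best_candidate, best_score, generator_calls_used).
    All names here are hypothetical and chosen for illustration.
    """
    rng = random.Random(seed)
    best: Optional[str] = None
    best_score = float("-inf")
    calls = 0
    while calls < call_budget:
        candidate = generate(rng)   # one (assumed) model call
        calls += 1
        score = verify(candidate)   # one (assumed) verifier or simulator check
        if score > best_score:
            best, best_score = candidate, score
        if best_score >= accept_score:
            break                   # stop early once a candidate is accepted
    return best, best_score, calls


# Toy stand-ins (assumptions): the "model" proposes an action name and the
# "verifier" accepts only the single action a simulator would approve.
def toy_generate(rng: random.Random) -> str:
    return rng.choice(["retry", "escalate", "refund", "ignore"])


def toy_verify(action: str) -> float:
    return 1.0 if action == "refund" else 0.0


if __name__ == "__main__":
    print(generate_verify_search(toy_generate, toy_verify, call_budget=8))
```

The budget parameter reflects the tension described above: a low call budget keeps execution cheap and predictable, while a larger one gives the search more chances to find a candidate the verifier accepts. When nothing is accepted, the sketch returns the best-scoring candidate seen so far, which is one possible fallback policy among several.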

References

How to adapt agent runtime environments to support such inference-time search techniques, and how to effectively utilize them in online execution settings, are open research questions.

Measuring Agents in Production (2512.04123 - Pan et al., 2 Dec 2025) in Discussion, Subsection: Open Research Questions