Adapting the dynamic batch size controller to diverse latency profiles

Determine how to adjust Combee’s dynamic batch size controller for workloads with substantially different latency profiles, including whether the power-law delay model and fixed marginal-reduction threshold τ remain effective and how they should be tuned.

Background

The dynamic batch size controller in Combee fits a power-law delay model and selects batch sizes using a fixed marginal-reduction threshold τ, with the aim of balancing quality and speed. The authors report that it performed well in their evaluation settings.
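To make the mechanism concrete, the following is a minimal sketch of one possible reading of such a controller, not the authors' implementation. It assumes delay follows delay(b) = a * b**(-k), fits a and k by least squares in log-log space, and picks the smallest batch size at which one more unit of batch size reduces delay by less than tau times the delay at batch size 1. The functional form, the normalization, the function names, and the measurements are all hypothetical.

```python
import math


def fit_power_law(batch_sizes, delays):
    """Least-squares fit of delay ~= a * b**(-k) in log-log space (assumed form)."""
    xs = [math.log(b) for b in batch_sizes]
    ys = [math.log(d) for d in delays]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = math.exp(my - slope * mx)
    return a, -slope  # k is the negated log-log slope


def select_batch_size(a, k, tau, max_batch):
    """Smallest b where going to b + 1 saves less than tau * delay(1).

    This stopping rule is an illustrative assumption; the source does not
    spell out how the marginal reduction is defined or normalized.
    """
    base = a  # delay(1) = a under the assumed model
    for b in range(1, max_batch):
        gain = a * b ** (-k) - a * (b + 1) ** (-k)
        if gain / base < tau:
            return b
    return max_batch


# Hypothetical measurements that follow an exact power law with k = 0.5.
sizes = [1, 2, 4, 8, 16]
delays = [100.0 * b ** -0.5 for b in sizes]

a, k = fit_power_law(sizes, delays)
chosen = select_batch_size(a, k, tau=0.01, max_batch=64)
print(f"a={a:.2f}, k={k:.3f}, chosen batch size={chosen}")
```

Under this reading, a smaller tau pushes the controller toward larger batch sizes, which is why a fixed value may not transfer across workloads whose delay curves flatten at different rates.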

However, latency characteristics vary across tasks and deployment environments. The authors note that the controller may require adjustment for workloads with substantially different latency profiles, which leaves open how general the power-law model is and how τ should be tuned.
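One generic way to probe whether a power-law model still describes a new workload is to check goodness of fit in log-log space. The sketch below is an assumption-laden illustration, not a procedure from the paper: it computes R² of a straight-line fit to (log b, log delay) and compares a hypothetical pure power-law profile with a hypothetical profile that has a large fixed overhead. The function name and both profiles are invented for illustration.

```python
import math


def log_log_r2(batch_sizes, delays):
    """R^2 of a straight-line fit to (log b, log delay).

    Values near 1 suggest a power law describes the measurements well;
    lower values suggest another functional form may be needed.
    """
    xs = [math.log(b) for b in batch_sizes]
    ys = [math.log(d) for d in delays]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


sizes = [1, 2, 4, 8, 16, 32]

# Hypothetical profile 1: exact power law.
pure = [100.0 * b ** -0.5 for b in sizes]

# Hypothetical profile 2: fixed overhead of 50 plus a term that shrinks as 1/b.
with_floor = [50.0 + 100.0 / b for b in sizes]

r2_pure = log_log_r2(sizes, pure)
r2_floor = log_log_r2(sizes, with_floor)
print(f"pure power law R^2: {r2_pure:.4f}")
print(f"fixed-overhead profile R^2: {r2_floor:.4f}")
```

A noticeably lower R² for a new workload would be one signal that the delay model, and not only τ, may need revisiting. What counts as "noticeably lower" is itself a tuning decision that the source leaves open.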

References

While Combee demonstrates consistent improvements across our evaluation settings, several aspects remain open for future work. Second, the dynamic batch size controller relies on a power-law delay model with a fixed marginal-reduction threshold ($\tau$), which performed well in our settings but may require adjustment for workloads with substantially different latency profiles.

Combee: Scaling Prompt Learning for Self-Improving Language Model Agents (2604.04247 - Li et al., 5 Apr 2026) in Section: Limitations and Future Work