Closing the gap between myopic lower bounds and polynomial particle requirements
Characterize and close the gap between (i) the lower bound showing that any myopic particle filtering algorithm requires $\Omega(\log H/\log \log H)$ particles to achieve non-trivial sampling accuracy under constant-factor inaccuracies in the process reward model, and (ii) the polynomial-in-$H$ particle requirements currently known for Sequential Monte Carlo and its variants. Determine the optimal scaling in $H$, possibly by allowing lookahead beyond myopic methods.
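To give a rough sense of how wide this gap is, the following minimal sketch tabulates the two growth rates for a few values of H. It is illustrative only: all hidden constants are dropped, and the polynomial exponent (here 1) is an assumption chosen for the example, since the statement above says only that the requirement is polynomial in H. The function names are hypothetical.

```python
import math


def myopic_lower_bound(H: int) -> float:
    """Growth rate log H / log log H with constants dropped.

    Requires H >= 3 so that log log H is positive.
    """
    if H < 3:
        raise ValueError("H must be at least 3 so that log log H > 0")
    return math.log(H) / math.log(math.log(H))


def polynomial_requirement(H: int, exponent: float = 1.0) -> float:
    """Growth rate H**exponent; the exponent is an illustrative assumption."""
    return float(H) ** exponent


def gap_table(values):
    """Return (H, lower, upper, upper / lower) for each H in values."""
    rows = []
    for H in values:
        lower = myopic_lower_bound(H)
        upper = polynomial_requirement(H)
        rows.append((H, lower, upper, upper / lower))
    return rows


if __name__ == "__main__":
    for H, lower, upper, ratio in gap_table([10, 100, 1000, 10**4, 10**6]):
        print(f"H={H:>8}  lower~{lower:6.2f}  upper~{upper:10.0f}  ratio~{ratio:10.1f}")
```

Even with exponent 1, the ratio between the two rates grows nearly linearly in H, so the question of which rate is correct matters at any realistic horizon.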
References
In \cref{thm:pf-lb-main} we show that fully avoiding this lower bound would require lookahead: any myopic method needs at least $\Omega(\log H/\log \log H)$ particles (\cref{thm:pf-lb-main}). Closing this gap is open.
— Reject, Resample, Repeat: Understanding Parallel Reasoning in Language Model Inference
(2603.07887 - Golowich et al., 9 Mar 2026) in Section 1, Subsection "Theoretical Contributions: A Principled Analysis of Particle Filtering Methods" (Contribution III: Limits of particle filtering)