On the convergence of optimistic policy iteration for stochastic shortest path problem
Abstract: In this paper, we prove convergence results for a special case of the optimistic policy iteration algorithm applied to the stochastic shortest path problem. For the policy evaluation step, we consider both Monte Carlo and $TD(\lambda)$ methods, under the condition that the termination state is eventually reached almost surely.
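
To make the setting concrete, the following is a minimal sketch of optimistic policy iteration with Monte Carlo policy evaluation on a toy stochastic shortest path problem. Everything in it is an assumption for illustration and is not taken from the paper: the three-state model, the costs, the `safe` and `risky` actions, the $1/t$ step size, and the choice to run one rollout from every state per iteration and update only the starting state. The abstract does not say which special case the paper analyzes, so this should not be read as the authors' algorithm. In this toy model every policy reaches the termination state almost surely, which mirrors the condition stated above.

```python
import random

# Hypothetical toy SSP: states 0, 1, 2 and an absorbing termination state 3.
TERMINAL = 3
STATES = [0, 1, 2]
ACTIONS = ["safe", "risky"]


def transitions(state, action):
    """Return (one-step cost, [(probability, next_state), ...])."""
    if action == "safe":
        # Pay 1 and move one step closer to termination with certainty.
        return 1.0, [(1.0, state + 1)]
    # "risky": pay 0.4; jump two steps with prob. 0.5, otherwise stay put.
    return 0.4, [(0.5, min(state + 2, TERMINAL)), (0.5, state)]


def greedy_action(J, state):
    """Action minimizing expected one-step cost plus cost-to-go under J."""
    def q_value(action):
        cost, outcomes = transitions(state, action)
        return cost + sum(p * J[nxt] for p, nxt in outcomes)
    return min(ACTIONS, key=q_value)


def rollout_cost(policy, state, rng):
    """Simulate one trajectory under `policy` and return its total cost."""
    total = 0.0
    while state != TERMINAL:
        cost, outcomes = transitions(state, policy[state])
        total += cost
        next_states = [nxt for _, nxt in outcomes]
        weights = [p for p, _ in outcomes]
        state = rng.choices(next_states, weights=weights)[0]
    return total


def optimistic_policy_iteration(num_iterations=5000, seed=0):
    """Alternate a greedy policy update with a single-rollout MC update."""
    rng = random.Random(seed)
    J = [0.0] * (TERMINAL + 1)  # J[TERMINAL] stays 0 throughout
    for t in range(1, num_iterations + 1):
        # Policy update: greedy with respect to the current estimate J.
        policy = {s: greedy_action(J, s) for s in STATES}
        # "Optimistic" evaluation: one rollout per state, not a full evaluation.
        step = 1.0 / t
        for s in STATES:
            J[s] += step * (rollout_cost(policy, s, rng) - J[s])
    return J


if __name__ == "__main__":
    # Solving Bellman's equation by hand for this toy model gives
    # optimal costs-to-go of 1.6, 0.8, 0.8 for states 0, 1, 2.
    print(optimistic_policy_iteration())
```

The policy is refreshed after every single batch of rollouts rather than after the value estimate for the previous policy has converged, which is what distinguishes the optimistic variant from classical policy iteration. A $TD(\lambda)$ variant would replace the total rollout cost in the update with a $\lambda$-weighted combination of temporal-difference terms along the same trajectory.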