Exponential Lower Bounds For Policy Iteration (1003.3418v1)
Published 17 Mar 2010 in cs.DS
Abstract: We study policy iteration for infinite-horizon Markov decision processes. It has recently been shown that policy-iteration-style algorithms have exponential lower bounds in a two-player game setting. We extend these lower bounds to Markov decision processes with the total-reward and average-reward optimality criteria.