When do discounted-optimal policies also optimize the gain? (2304.08048v1)
Published 17 Apr 2023 in eess.SY, cs.SY, and math.OC
Abstract: In this technical note, we establish an upper bound on the discount-factor threshold starting from which all discounted-optimal deterministic policies are gain-optimal, and we prove that this bound is tight on an example. Because that theoretical threshold raises computability issues, we also provide a weaker bound that can be computed in polynomial time on ergodic MDPs.
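To make the two optimality criteria concrete, here is a minimal Python sketch on a toy two-state deterministic MDP. The MDP, the names `TRANSITIONS`, `POLICIES`, `discounted_value`, `gain`, and the rollout horizon are all assumptions made for illustration; none of them come from the paper, and the sketch does not compute the paper's bounds. It only shows that a discounted-optimal policy can fail to be gain-optimal for a small discount factor and become gain-optimal once the discount factor is large enough.

```python
# Hypothetical toy MDP (not from the paper).
# TRANSITIONS[state][action] = (next_state, reward)
TRANSITIONS = {
    0: {"stay": (0, 1.0), "go": (1, 0.0)},  # "go" gives up reward now...
    1: {"loop": (1, 2.0)},                  # ...to reach a better loop
}

# The only two deterministic policies of this toy MDP.
POLICIES = [
    {0: "stay", 1: "loop"},
    {0: "go", 1: "loop"},
]


def discounted_value(policy, gamma, start=0, horizon=2000):
    """Truncated discounted return of a deterministic policy from `start`."""
    state, total, weight = start, 0.0, 1.0
    for _ in range(horizon):
        state, reward = TRANSITIONS[state][policy[state]]
        total += weight * reward
        weight *= gamma
    return total


def gain(policy, start=0, horizon=2000):
    """Long-run average reward, estimated over a long finite rollout."""
    state, total = start, 0.0
    for _ in range(horizon):
        state, reward = TRANSITIONS[state][policy[state]]
        total += reward
    return total / horizon


def discounted_optimal(gamma):
    """Policy maximizing the discounted value from state 0."""
    return max(POLICIES, key=lambda p: discounted_value(p, gamma))


def gain_optimal():
    """Policy maximizing the estimated gain from state 0."""
    return max(POLICIES, key=gain)


if __name__ == "__main__":
    for gamma in (0.4, 0.6):
        best = discounted_optimal(gamma)
        print(gamma, best[0], best == gain_optimal())
```

In this toy, taking "stay" in state 0 is worth 1/(1 - gamma) and taking "go" is worth 2*gamma/(1 - gamma), so the discounted-optimal choice switches from "stay" to the gain-optimal "go" once gamma exceeds 0.5. The toy is not ergodic, so it illustrates only the phenomenon, not the tractable bound mentioned in the abstract.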