Papers
Topics
Authors
Recent
Search
2000 character limit reached

A Greedy Approximation of Bayesian Reinforcement Learning with Probably Optimistic Transition Model

Published 13 Mar 2013 in cs.AI, cs.LG, and stat.ML | (1303.3163v3)

Abstract: Bayesian Reinforcement Learning (RL) is capable of not only incorporating domain knowledge, but also solving the exploration-exploitation dilemma in a natural way. As Bayesian RL is intractable except for special cases, previous work has proposed several approximation methods. However, these methods are usually too sensitive to parameter values, and finding an acceptable parameter setting is practically impossible in many applications. In this paper, we propose a new algorithm that greedily approximates Bayesian RL to achieve robustness in parameter space. We show that for a desired learning behavior, our proposed algorithm has a polynomial sample complexity that is lower than those of existing algorithms. We also demonstrate that the proposed algorithm naturally outperforms other existing algorithms when the prior distributions are not significantly misleading. On the other hand, the proposed algorithm cannot handle greatly misspecified priors as well as the other algorithms can. This is a natural consequence of the fact that the proposed algorithm is greedier than the other algorithms. Accordingly, we discuss a way to select an appropriate algorithm for different tasks based on the algorithms' greediness. We also introduce a new way of simplifying Bayesian planning, based on which future work would be able to derive new algorithms.

Citations (5)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.