Generalised Entropy MDPs and Minimax Regret (1412.3276v1)
Published 10 Dec 2014 in cs.LG and stat.ML
Abstract: Bayesian methods suffer from the problem of how to specify prior beliefs. One interesting idea is to consider worst-case priors. This requires solving a stochastic zero-sum game. In this paper, we extend well-known results from bandit theory in order to discover minimax-Bayes policies and discuss when they are practical.
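The worst-case-prior idea in the abstract can be pictured as a finite zero-sum game between nature (who picks the prior over environments) and the agent (who picks a policy). As a minimal sketch, with an entirely hypothetical regret matrix, the minimax-Bayes prior is the mixture over environments that maximises the Bayes regret of the agent's best response; for two environments a grid search suffices:

```python
import numpy as np

# Hypothetical regret matrix: rows are agent policies, columns are
# environments. R[a, e] is the regret of policy a in environment e.
# The numbers are illustrative only, not from the paper.
R = np.array([
    [0.0, 1.0],   # policy tuned to environment 0
    [1.0, 0.0],   # policy tuned to environment 1
    [0.4, 0.4],   # a hedging policy
])

# Nature mixes over the two environments with prior (p, 1 - p).
# The agent best-responds by choosing the policy with the smallest
# expected (Bayes) regret; the worst-case prior maximises that minimum.
ps = np.linspace(0.0, 1.0, 1001)
bayes_regret = np.array([(R @ np.array([p, 1.0 - p])).min() for p in ps])
worst_p = ps[bayes_regret.argmax()]   # least-favourable prior on environment 0
game_value = bayes_regret.max()       # minimax Bayes regret of the game
```

Here the hedging policy caps the game value at 0.4, and nature's least-favourable prior is any mixture where both specialised policies are at least that bad; in larger games the same maximin computation would be done with a linear program rather than a grid.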