A study of Thompson Sampling with Parameter h

Published 5 Oct 2017 in cs.LG, cs.IT, and math.IT | (1710.02174v1)

Abstract: The Thompson Sampling algorithm is a well-known Bayesian algorithm for solving the stochastic multi-armed bandit problem. At each time step, the algorithm chooses each arm with probability proportional to the posterior probability that it is the current best arm. We modify the strategy by introducing a parameter h, which alters the importance of the probability of an arm being the current best arm. We show that, for two-armed bandits, the optimality of Thompson Sampling is robust to this perturbation within a range of parameter values.
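
The abstract does not say how h enters the selection rule, so the following is only an illustrative sketch under stated assumptions, not the paper's definition. It assumes Bernoulli rewards with independent Beta(1, 1) priors, and it assumes that each arm is chosen with probability proportional to its posterior probability of being best raised to the power h, so that h = 1 gives back ordinary Thompson Sampling. The function names (`prob_best`, `selection_probs`, `run`) are hypothetical. The probability of being best is estimated by Monte Carlo so that it can be reweighted explicitly.

```python
import random


def prob_best(successes, failures, n_samples=2000, rng=random):
    """Monte Carlo estimate of each arm's posterior probability of being best.

    Assumption: Bernoulli rewards, independent Beta(1 + s, 1 + f) posteriors.
    """
    k = len(successes)
    wins = [0] * k
    for _ in range(n_samples):
        draws = [rng.betavariate(1 + successes[i], 1 + failures[i])
                 for i in range(k)]
        wins[draws.index(max(draws))] += 1
    return [w / n_samples for w in wins]


def selection_probs(p_best, h):
    """Assumed form of the modification: weight arm i by p_best[i] ** h.

    Assumes h >= 0; h = 1 leaves the probabilities unchanged.
    """
    weights = [p ** h for p in p_best]
    total = sum(weights)
    return [w / total for w in weights]


def run(true_means, h, horizon, seed=0, n_samples=200):
    """Simulate the assumed h-weighted rule on a Bernoulli bandit."""
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    total_reward = 0
    for _ in range(horizon):
        p_best = prob_best(successes, failures, n_samples, rng)
        probs = selection_probs(p_best, h)
        arm = rng.choices(range(k), weights=probs)[0]
        reward = 1 if rng.random() < true_means[arm] else 0
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
        total_reward += reward
    return successes, failures, total_reward
```

Under this assumed form, values of h above 1 concentrate the choice on the arm currently believed to be best, while values below 1 spread it more evenly across arms. Calling `run([0.4, 0.6], h, 500)` for a few values of h gives a rough way to compare cumulative reward on a two-armed instance.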


Authors (1)
