A Note on the Equivalence of Upper Confidence Bounds and Gittins Indices for Patient Agents
Abstract: This note gives a short, self-contained proof of a sharp connection between Gittins indices and Bayesian upper confidence bound algorithms. I consider a Gaussian multi-armed bandit problem with discount factor $\gamma$. The Gittins index of an arm is shown to equal the $\gamma$-quantile of the posterior distribution of the arm's mean plus an error term that vanishes as $\gamma\to 1$. In this sense, for sufficiently patient agents, a Gittins index measures the highest plausible mean reward of an arm in a manner equivalent to an upper confidence bound.
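To make the quantile term in the abstract concrete, here is a minimal Python sketch that computes the $\gamma$-quantile of a Gaussian posterior over an arm's mean. It is an illustration, not the note's method: it evaluates only the leading quantile term and ignores the vanishing error term, so it does not compute a Gittins index. The function name `posterior_quantile` and the two example posteriors are assumptions introduced here.

```python
from statistics import NormalDist


def posterior_quantile(mean: float, std: float, gamma: float) -> float:
    """Return the gamma-quantile of a N(mean, std^2) posterior over an arm's mean.

    This is only the quantile term from the abstract; the error term that
    separates it from the Gittins index is not modeled.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    return NormalDist(mu=mean, sigma=std).inv_cdf(gamma)


# Hypothetical posteriors as (mean, std), chosen only for illustration.
arms = {"well_explored": (0.50, 0.05), "barely_explored": (0.30, 0.40)}

for gamma in (0.9, 0.99, 0.999):
    scores = {name: posterior_quantile(m, s, gamma) for name, (m, s) in arms.items()}
    best = max(scores, key=scores.get)
    print(f"gamma={gamma}: {scores} -> highest quantile: {best}")
```

As $\gamma$ increases, the quantile moves further into the upper tail, so an arm with a wide posterior receives a larger bonus over its posterior mean. This is the sense in which the quantile behaves like an upper confidence bound.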