Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 154 tok/s
Gemini 2.5 Pro 48 tok/s Pro
GPT-5 Medium 38 tok/s Pro
GPT-5 High 36 tok/s Pro
GPT-4o 109 tok/s Pro
Kimi K2 206 tok/s Pro
GPT OSS 120B 434 tok/s Pro
Claude Sonnet 4.5 39 tok/s Pro
2000 character limit reached

The Multi-Armed Bandit Problem: An Efficient Non-Parametric Solution (1703.08285v4)

Published 24 Mar 2017 in math.ST and stat.TH

Abstract: Lai and Robbins (1985) and Lai (1987) provided efficient parametric solutions to the multi-armed bandit problem, showing that arm allocation via upper confidence bounds (UCB) achieves minimum regret. These bounds are constructed from the Kullback-Leibler information of the reward distributions, estimated from specified parametric families. In recent years there has been renewed interest in the multi-armed bandit problem due to new applications in machine learning algorithms and data analytics. Non-parametric arm allocation procedures like $\epsilon$-greedy, Boltzmann exploration and BESA were studied, and modified versions of the UCB procedure were also analyzed under non-parametric settings. However unlike UCB these non-parametric procedures are not efficient under general parametric settings. In this paper we propose efficient non-parametric procedures.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (1)

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.