
An algorithm with nearly optimal pseudo-regret for both stochastic and adversarial bandits (1605.08722v1)

Published 27 May 2016 in cs.LG

Abstract: We present an algorithm that achieves almost optimal pseudo-regret bounds against adversarial and stochastic bandits. Against adversarial bandits the pseudo-regret is $O(K\sqrt{n \log n})$ and against stochastic bandits the pseudo-regret is $O(\sum_i (\log n)/\Delta_i)$. We also show that no algorithm with $O(\log n)$ pseudo-regret against stochastic bandits can achieve $\tilde{O}(\sqrt{n})$ expected regret against adaptive adversarial bandits. This complements previous results of Bubeck and Slivkins (2012) that show $\tilde{O}(\sqrt{n})$ expected adversarial regret with $O((\log n)^2)$ stochastic pseudo-regret.
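
To give a rough sense of the gap between the two regimes in the abstract, the following sketch evaluates the two bound expressions with all hidden constants set to 1. It is only an illustration of the formulas, not the paper's algorithm; the function names, the gap values, and the horizon are hypothetical choices, and the sum is assumed to run over suboptimal arms only (those with $\Delta_i > 0$).

```python
import math

def adversarial_bound(K, n):
    """Evaluate K * sqrt(n * log n), with the hidden constant assumed to be 1."""
    return K * math.sqrt(n * math.log(n))

def stochastic_bound(gaps, n):
    """Evaluate sum_i log(n) / Delta_i over arms with Delta_i > 0 (constant assumed 1)."""
    return sum(math.log(n) / d for d in gaps if d > 0)

# Hypothetical instance: 4 arms, gap 0.0 marks the optimal arm.
gaps = [0.0, 0.1, 0.2, 0.5]
n = 10**6

adv = adversarial_bound(len(gaps), n)
sto = stochastic_bound(gaps, n)
print(f"adversarial-style bound: {adv:.1f}")
print(f"stochastic-style bound:  {sto:.1f}")
```

With fixed gaps, the stochastic expression grows only logarithmically in $n$ while the adversarial one grows like $\sqrt{n \log n}$, which is why achieving both with one algorithm is of interest.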

Authors (2)
  1. Peter Auer (14 papers)
  2. Chao-Kai Chiang (7 papers)
Citations (107)
