Multi-Armed Bandits with Metric Movement Costs (1710.08997v1)

Published 24 Oct 2017 in cs.LG

Abstract: We consider the non-stochastic Multi-Armed Bandit problem in a setting where there is a fixed and known metric on the action space that determines a cost for switching between any pair of actions. The loss of the online learner has two components: the first is the usual loss of the selected actions, and the second is an additional loss due to switching between actions. Our main contribution gives a tight characterization of the expected minimax regret in this setting, in terms of a complexity measure $\mathcal{C}$ of the underlying metric which depends on its covering numbers. In finite metric spaces with $k$ actions, we give an efficient algorithm that achieves regret of the form $\widetilde{O}(\max\{\mathcal{C}^{1/3}T^{2/3},\sqrt{kT}\})$, and show that this is the best possible. Our regret bound generalizes previously known regret bounds for some special cases: (i) the unit-switching cost regret $\widetilde{\Theta}(\max\{k^{1/3}T^{2/3},\sqrt{kT}\})$ where $\mathcal{C}=\Theta(k)$, and (ii) the interval metric with regret $\widetilde{\Theta}(\max\{T^{2/3},\sqrt{kT}\})$ where $\mathcal{C}=\Theta(1)$. For infinite metric spaces with Lipschitz loss functions, we derive a tight regret bound of $\widetilde{\Theta}(T^{\frac{d+1}{d+2}})$ where $d \ge 1$ is the Minkowski dimension of the space, which is known to be tight even when there are no switching costs.
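
To make the setting concrete, here is a minimal runnable sketch (not the paper's algorithm) of bandit learning with metric movement costs: a learner on the interval metric whose total loss is the sum of per-round losses and the metric distances between consecutive actions. The blocked EXP3-style baseline, the i.i.d. Bernoulli losses, and all parameter choices are illustrative assumptions; the paper studies the non-stochastic (adversarial) case and gives an algorithm matching the bounds above.

```python
# Illustrative sketch only: the problem setting of bandits with metric
# movement costs, plus a simple "blocked" EXP3-style baseline that limits
# how often it switches arms. All names and parameters are assumptions
# made for this example, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)

k, T = 5, 20000
# A fixed, known metric on the action space: distances on a line,
# i.e. the "interval metric" special case mentioned in the abstract.
points = np.linspace(0.0, 1.0, k)
metric = np.abs(points[:, None] - points[None, :])

# Stand-in for the adversary: fixed Bernoulli loss means (the paper is
# non-stochastic; i.i.d. losses here just give something to simulate).
means = rng.uniform(0.2, 0.8, size=k)

def play(T, block_len):
    """Blocked EXP3-style learner: commit to one arm per block, so the
    number of switches (hence the movement cost) is at most T / block_len."""
    num_blocks = max(T // block_len, 1)
    eta = np.sqrt(np.log(k) / (k * num_blocks))
    cum_loss = np.zeros(k)          # cumulative importance-weighted losses
    loss_total, move_total = 0.0, 0.0
    prev_arm = None
    for start in range(0, T, block_len):
        probs = np.exp(-eta * (cum_loss - cum_loss.min()))
        probs /= probs.sum()
        arm = rng.choice(k, p=probs)
        if prev_arm is not None:
            move_total += metric[prev_arm, arm]   # movement-cost component
        prev_arm = arm
        block = min(block_len, T - start)
        block_loss = rng.binomial(1, means[arm], size=block).sum()
        loss_total += block_loss
        # importance-weighted update from the block's average loss
        cum_loss[arm] += (block_loss / block) / probs[arm]
    return loss_total, move_total

loss, move = play(T, block_len=int(T ** (1 / 3)))
best = means.min() * T
print(f"learner loss {loss:.0f} + movement {move:.1f}  vs  best fixed arm {best:.0f}")
```

Committing to one arm per block of length roughly $T^{1/3}$ caps the number of switches at about $T^{2/3}$, which is the standard way $T^{2/3}$-type rates arise once movement is costly; the paper's contribution is the tight dependence on the metric's complexity measure $\mathcal{C}$, which this toy baseline does not capture.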

Authors (3)
  1. Tomer Koren (79 papers)
  2. Roi Livni (35 papers)
  3. Yishay Mansour (158 papers)
Citations (21)

Summary

We haven't generated a summary for this paper yet.