
Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication (1904.06309v2)

Published 12 Apr 2019 in cs.LG and stat.ML

Abstract: We study the problem of regret minimization for distributed bandit learning, in which $M$ agents work collaboratively to minimize their total regret under the coordination of a central server. Our goal is to design communication protocols with near-optimal regret and low communication cost, measured by the total amount of transmitted data. For distributed multi-armed bandits, we propose a protocol with near-optimal regret and only $O(M\log(MK))$ communication cost, where $K$ is the number of arms. The communication cost is independent of the time horizon $T$, has only logarithmic dependence on the number of arms, and matches the lower bound up to a logarithmic factor. For distributed $d$-dimensional linear bandits, we propose a protocol that achieves near-optimal regret with communication cost of order $\tilde{O}(Md)$, which has only logarithmic dependence on $T$.
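
The abstract states the guarantees but not the mechanics. As a rough illustration of the setting ($M$ agents pulling arms, a central server aggregating statistics in a few communication rounds, cost counted in transmitted scalars), here is a minimal simulation sketch. The phase-based elimination scheme below is a generic stand-in, not the paper's protocol, and every name and constant in it is hypothetical.

```python
import numpy as np

def distributed_mab(M=4, K=8, T=10_000, rounds=10, seed=0):
    """Toy simulation of the distributed MAB setting: M agents pull
    Bernoulli arms, and a central server aggregates (count, sum)
    statistics in a small number of communication rounds, eliminating
    clearly suboptimal arms. Illustrative only, not the paper's protocol."""
    rng = np.random.default_rng(seed)
    means = rng.uniform(0.0, 1.0, size=K)   # unknown arm means
    active = list(range(K))                 # arms still in play
    pulls = np.zeros(K)
    rewards = np.zeros(K)
    comm = 0                                # scalars sent to/from the server
    budget = T // (rounds * M)              # pulls per agent per phase

    for _ in range(rounds):
        if len(active) == 1:
            break
        n = max(1, budget // len(active))
        for _agent in range(M):             # each agent explores uniformly
            for arm in active:
                pulls[arm] += n
                rewards[arm] += rng.binomial(n, means[arm])
        comm += 2 * M * len(active)         # upload one (count, sum) per arm
        mu = rewards[active] / pulls[active]
        width = np.sqrt(np.log(M * K * T) / pulls[active])
        best_lcb = np.max(mu - width)       # best lower confidence bound
        active = [a for a, m, w in zip(active, mu, width) if m + w >= best_lcb]
        comm += M * len(active)             # broadcast the surviving arm set

    regret = np.max(means) * pulls.sum() - (means * pulls).sum()
    return regret, comm

if __name__ == "__main__":
    regret, comm = distributed_mab()
    print(f"pseudo-regret ~ {regret:.1f}, communication ~ {comm} scalars")
```

In this toy version the number of communication rounds is fixed in advance; the point of the paper is that a carefully scheduled protocol can keep the total transmitted data at $O(M\log(MK))$ while still achieving near-optimal regret.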

Authors (4)
  1. Yuanhao Wang
  2. Jiachen Hu
  3. Xiaoyu Chen
  4. Liwei Wang
Citations (7)
