Nearly Optimal Best-of-Both-Worlds Algorithms for Online Learning with Feedback Graphs (2206.00873v2)

Published 2 Jun 2022 in cs.LG

Abstract: This study considers online learning with general directed feedback graphs. For this problem, we present best-of-both-worlds algorithms that achieve nearly tight regret bounds for adversarial environments as well as poly-logarithmic regret bounds for stochastic environments. As Alon et al. [2015] have shown, tight regret bounds depend on the structure of the feedback graph: strongly observable graphs yield minimax regret of $\tilde{\Theta}(\alpha^{1/2} T^{1/2})$, while weakly observable graphs induce minimax regret of $\tilde{\Theta}(\delta^{1/3} T^{2/3})$, where $\alpha$ and $\delta$, respectively, represent the independence number of the graph and the domination number of a certain portion of the graph. Our proposed algorithm for strongly observable graphs has a regret bound of $\tilde{O}(\alpha^{1/2} T^{1/2})$ for adversarial environments, as well as of $O\left(\frac{\alpha (\ln T)^3}{\Delta_{\min}}\right)$ for stochastic environments, where $\Delta_{\min}$ expresses the minimum suboptimality gap. This result resolves an open question raised by Erez and Koren [2021]. We also provide an algorithm for weakly observable graphs that achieves a regret bound of $\tilde{O}(\delta^{1/3} T^{2/3})$ for adversarial environments and poly-logarithmic regret for stochastic environments. The proposed algorithms are based on the follow-the-regularized-leader approach combined with newly designed update rules for learning rates.
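
To make the setting concrete, below is a minimal Python sketch of follow-the-regularized-leader with the entropic regularizer (i.e., exponential weights) and importance-weighted loss estimates under graph feedback. The function name, the fixed learning rate `eta`, and the `G_in` adjacency encoding are illustrative assumptions; the paper's algorithms instead use newly designed adaptive learning-rate update rules, which is what yields the best-of-both-worlds guarantees.

```python
import numpy as np

def ftrl_graph_feedback(G_in, losses, eta=0.1, rng=None):
    """Sketch of FTRL with entropic regularizer under graph feedback.

    G_in[j] : set of arms whose play reveals the loss of arm j (its in-
              neighbors; here assumed to contain j itself, a simplification
              valid for one class of strongly observable graphs)
    losses  : (T, K) array of losses in [0, 1]
    eta     : fixed learning rate (the paper uses adaptive update rules)
    """
    rng = rng or np.random.default_rng(0)
    T, K = losses.shape
    cum_est = np.zeros(K)  # cumulative importance-weighted loss estimates
    total_loss = 0.0
    for t in range(T):
        # FTRL with negative-entropy regularizer = softmax over -eta * cum_est;
        # subtracting the min keeps the exponentials numerically stable.
        w = np.exp(-eta * (cum_est - cum_est.min()))
        p = w / w.sum()
        arm = rng.choice(K, p=p)
        total_loss += losses[t, arm]
        for j in range(K):
            if arm in G_in[j]:  # the loss of arm j is observed this round
                P_j = p[list(G_in[j])].sum()  # prob. of observing arm j
                cum_est[j] += losses[t, j] / P_j  # unbiased estimate
    return total_loss
```

For a full feedback graph (every `G_in[j]` contains all arms) this reduces to exponential weights with full information, while for self-loops only it reduces to the bandit setting, illustrating how the graph interpolates between the two regimes.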

Authors (3)
  1. Shinji Ito (31 papers)
  2. Taira Tsuchiya (19 papers)
  3. Junya Honda (47 papers)
Citations (22)