Revisiting Weighted Strategy for Non-stationary Parametric Bandits (2303.02691v2)

Published 5 Mar 2023 in cs.LG and stat.ML

Abstract: Non-stationary parametric bandits have attracted much attention recently. There are three principled ways to deal with non-stationarity, including sliding-window, weighted, and restart strategies. As many non-stationary environments exhibit gradual drifting patterns, the weighted strategy is commonly adopted in real-world applications. However, previous theoretical studies show that its analysis is more involved and the algorithms are either computationally less efficient or statistically suboptimal. This paper revisits the weighted strategy for non-stationary parametric bandits. In linear bandits (LB), we discover that this undesirable feature is due to an inadequate regret analysis, which results in an overly complex algorithm design. We propose a refined analysis framework, which simplifies the derivation and importantly produces a simpler weight-based algorithm that is as efficient as window/restart-based algorithms while retaining the same regret as previous studies. Furthermore, our new framework can be used to improve regret bounds of other parametric bandits, including Generalized Linear Bandits (GLB) and Self-Concordant Bandits (SCB). For example, we develop a simple weighted GLB algorithm with an $\widetilde{O}(k_\mu^{\frac{5}{4}} c_\mu^{-\frac{3}{4}} d^{\frac{3}{4}} P_T^{\frac{1}{4}} T^{\frac{3}{4}})$ regret, improving the $\widetilde{O}(k_\mu^{2} c_\mu^{-1} d^{\frac{9}{10}} P_T^{\frac{1}{5}} T^{\frac{4}{5}})$ bound in prior work, where $k_\mu$ and $c_\mu$ characterize the reward model's nonlinearity, $P_T$ measures the non-stationarity, $d$ and $T$ denote the dimension and time horizon.
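
To make the weighted strategy mentioned in the abstract concrete, here is a minimal sketch of a generic exponentially discounted ridge-regression estimator for a linear reward model. It is not the paper's algorithm: the class name `DiscountedLeastSquares`, the discount factor `gamma`, the regularizer `lam`, and the toy drift in `demo` are all illustrative assumptions, and the exploration bonus that a full bandit algorithm would need is omitted.

```python
import numpy as np


class DiscountedLeastSquares:
    """Hypothetical helper: ridge regression with exponentially decaying weights."""

    def __init__(self, dim, gamma=0.95, lam=0.1):
        self.gamma = gamma              # assumed discount factor in (0, 1)
        self.lam = lam                  # assumed ridge regularization strength
        self.V = lam * np.eye(dim)      # weighted Gram matrix plus lam * I
        self.b = np.zeros(dim)          # weighted sum of reward * feature

    def update(self, x, reward):
        x = np.asarray(x, dtype=float)
        # Discount old data by gamma; the (1 - gamma) * lam term keeps the
        # regularizer at exactly lam * I after every update.
        self.V = (self.gamma * self.V + np.outer(x, x)
                  + (1.0 - self.gamma) * self.lam * np.eye(len(x)))
        self.b = self.gamma * self.b + reward * x

    def estimate(self):
        # Solve V theta = b rather than forming an explicit inverse.
        return np.linalg.solve(self.V, self.b)


def demo(seed=0):
    """Toy environment (an assumption): the parameter switches halfway through."""
    rng = np.random.default_rng(seed)
    dim, horizon = 2, 400
    est = DiscountedLeastSquares(dim, gamma=0.95, lam=0.1)
    theta = np.array([1.0, 0.0])
    for t in range(horizon):
        if t == horizon // 2:
            theta = np.array([0.0, 1.0])    # abrupt change in the true parameter
        x = rng.normal(size=dim)
        reward = x @ theta + 0.01 * rng.normal()
        est.update(x, reward)
    return est.estimate(), theta
```

Calling `demo()` returns the final estimate together with the post-change parameter. Because observations from before the change are down-weighted geometrically, the estimate is driven almost entirely by recent data, which is the intuition behind using weights for gradually drifting environments.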

Authors (3)
  1. Jing Wang (740 papers)
  2. Peng Zhao (162 papers)
  3. Zhi-Hua Zhou (126 papers)
Citations (5)
