Incentivized Exploration of Non-Stationary Stochastic Bandits (2403.10819v1)

Published 16 Mar 2024 in cs.LG, cs.AI, and stat.ML

Abstract: We study incentivized exploration for the multi-armed bandit (MAB) problem with non-stationary reward distributions, where players receive compensation for exploring arms other than the greedy choice and may provide biased feedback on the reward. We consider two different non-stationary environments: abruptly-changing and continuously-changing, and propose respective incentivized exploration algorithms. We show that the proposed algorithms achieve sublinear regret and compensation over time, thus effectively incentivizing exploration despite the non-stationarity and the biased or drifted feedback.
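
The abstract describes the interaction loop in words only. As a rough illustration of the setting, the sketch below implements a sliding-window UCB recommender that pays compensation whenever the recommended arm differs from the players' greedy choice, with additive noise standing in for biased or drifted feedback. This is an assumption-laden toy, not the paper's algorithm: the window size, confidence bonus, compensation rule, and drift model are all illustrative choices.

```python
import numpy as np

# Hedged sketch of incentivized exploration under non-stationary rewards.
# NOT the paper's algorithm: window, bonus, compensation rule, and the
# drifted-feedback model are illustrative assumptions.

def sliding_window_incentivized_ucb(reward_fns, T, window=200, sigma_drift=0.05, seed=0):
    rng = np.random.default_rng(seed)
    K = len(reward_fns)
    history = []            # (round, arm, observed reward), filtered to the window
    total_compensation = 0.0

    for t in range(T):
        # Sliding-window statistics cope with abruptly/continuously changing means.
        recent = [(a, r) for (s, a, r) in history if s > t - window]
        counts = np.array([sum(1 for a, _ in recent if a == k) for k in range(K)])
        means = np.array([
            np.mean([r for a, r in recent if a == k]) if counts[k] > 0 else 0.0
            for k in range(K)
        ])

        if np.any(counts == 0):
            arm = int(np.argmin(counts))            # play under-sampled arms first
        else:
            bonus = np.sqrt(2 * np.log(min(t + 1, window)) / counts)
            arm = int(np.argmax(means + bonus))     # principal's UCB recommendation

        greedy = int(np.argmax(means))
        if arm != greedy:
            # Myopic player would pick the greedy arm; pay the estimated gap.
            total_compensation += max(means[greedy] - means[arm], 0.0)

        true_mean = reward_fns[arm](t)              # non-stationary mean reward
        reward = rng.binomial(1, float(np.clip(true_mean, 0.0, 1.0)))
        drift = rng.normal(0.0, sigma_drift)        # biased / drifted feedback
        history.append((t, arm, reward + drift))

    return total_compensation


if __name__ == "__main__":
    # Abruptly-changing environment: the two arms swap means halfway through.
    arms = [lambda t: 0.8 if t < 500 else 0.3,
            lambda t: 0.3 if t < 500 else 0.8]
    print(f"total compensation paid: {sliding_window_incentivized_ucb(arms, T=1000):.2f}")
```

Compensation is charged only on rounds where the recommendation departs from the empirically best arm, mirroring the incentive structure described in the abstract; the sliding window is one standard device for tracking abruptly- or continuously-changing reward means.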

