
Approximate Top-$m$ Arm Identification with Heterogeneous Reward Variances (2204.05245v1)

Published 11 Apr 2022 in cs.LG, cs.IT, and math.IT

Abstract: We study the effect of reward variance heterogeneity in the approximate top-$m$ arm identification setting. In this setting, the reward for the $i$-th arm follows a $\sigma_i^2$-sub-Gaussian distribution, and the agent needs to incorporate this knowledge to minimize the expected number of arm pulls to identify $m$ arms with the largest means within error $\epsilon$ out of the $n$ arms, with probability at least $1-\delta$. We show that the worst-case sample complexity of this problem is $$\Theta\left( \sum_{i=1}^{n} \frac{\sigma_i^2}{\epsilon^2} \ln\frac{1}{\delta} + \sum_{i \in G^{m}} \frac{\sigma_i^2}{\epsilon^2} \ln(m) + \sum_{j \in G^{l}} \frac{\sigma_j^2}{\epsilon^2} \text{Ent}(\sigma^2_{G^{r}}) \right),$$ where $G^{m}, G^{l}, G^{r}$ are certain specific subsets of the overall arm set $\{1, 2, \ldots, n\}$, and $\text{Ent}(\cdot)$ is an entropy-like function which measures the heterogeneity of the variance proxies. The upper bound of the complexity is obtained using a divide-and-conquer style algorithm, while the matching lower bound relies on the study of a dual formulation.
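
To make the role of the variance proxies concrete, the following is a minimal sketch of a naive, non-adaptive baseline. It is not the paper's divide-and-conquer algorithm. It assumes each arm is pulled a fixed number of times chosen from a standard sub-Gaussian tail bound plus a union bound, so that every empirical mean is within $\epsilon/2$ of its true mean with probability at least $1-\delta$. The function names `naive_budgets` and `naive_top_m` and the use of Gaussian rewards in the simulation are illustrative assumptions.

```python
import math
import random


def naive_budgets(sigma2, eps, delta):
    """Per-arm pull counts for a naive (non-adaptive) baseline.

    Assumption: arm i is sigma2[i]-sub-Gaussian, so the mean of t pulls
    deviates from the true mean by more than eps/2 with probability at most
    2 * exp(-t * eps**2 / (8 * sigma2[i])). A union bound over the n arms
    then gives t_i = ceil(8 * sigma2[i] / eps**2 * ln(2n / delta)).
    """
    n = len(sigma2)
    log_term = math.log(2 * n / delta)
    return [math.ceil(8 * s2 / eps**2 * log_term) for s2 in sigma2]


def naive_top_m(means, sigma2, m, eps, delta, seed=0):
    """Simulate the baseline and return the indices of the m arms with the
    largest empirical means (sorted), together with the total pull count.

    Rewards are drawn as Gaussians with variance sigma2[i]; this is one
    example of a sigma2[i]-sub-Gaussian distribution, chosen for illustration.
    """
    rng = random.Random(seed)
    budgets = naive_budgets(sigma2, eps, delta)
    empirical = []
    for mu, s2, t in zip(means, sigma2, budgets):
        total = sum(rng.gauss(mu, math.sqrt(s2)) for _ in range(t))
        empirical.append(total / t)
    ranked = sorted(range(len(means)), key=lambda i: empirical[i], reverse=True)
    return sorted(ranked[:m]), sum(budgets)


if __name__ == "__main__":
    arms, pulls = naive_top_m(
        means=[0.0, 0.0, 5.0], sigma2=[1.0, 4.0, 1.0], m=1, eps=1.0, delta=0.1
    )
    print(arms, pulls)
```

Under these assumptions the baseline uses on the order of $\sum_{i=1}^{n} \frac{\sigma_i^2}{\epsilon^2} \ln\frac{n}{\delta}$ pulls, so it pays a $\ln n$ factor on every arm. The bound stated in the abstract instead carries $\ln\frac{1}{\delta}$ on the full sum and places the additional $\ln(m)$ and entropy-like factors only on the subsets $G^{m}$ and $G^{l}$.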

