Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Stochastic Top-$K$ Subset Bandits with Linear Space and Non-Linear Feedback (1811.11925v2)

Published 29 Nov 2018 in cs.LG, cs.AI, and stat.ML

Abstract: Many real-world problems like Social Influence Maximization face the dilemma of choosing the best $K$ out of $N$ options at a given time instant. This setup can be modeled as a combinatorial bandit which chooses $K$ out of $N$ arms at each time, with an aim to achieve an efficient trade-off between exploration and exploitation. This is the first work for combinatorial bandits where the feedback received can be a non-linear function of the chosen $K$ arms. The direct use of multi-armed bandit requires choosing among $N$-choose-$K$ options making the state space large. In this paper, we present a novel algorithm which is computationally efficient and the storage is linear in $N$. The proposed algorithm is a divide-and-conquer based strategy, that we call CMAB-SM. Further, the proposed algorithm achieves a \textit{regret bound} of $\tilde O(K{\frac{1}{2}}N{\frac{1}{3}}T{\frac{2}{3}})$ for a time horizon $T$, which is \textit{sub-linear} in all parameters $T$, $N$, and $K$. %When applied to the problem of Social Influence Maximization, the performance of the proposed algorithm surpasses the UCB algorithm and some more sophisticated domain-specific methods.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (4)
  1. Mridul Agarwal (16 papers)
  2. Vaneet Aggarwal (222 papers)
  3. Christopher J. Quinn (10 papers)
  4. Abhishek K. Umrawal (9 papers)
Citations (9)

Summary

We haven't generated a summary for this paper yet.