Optimal Thresholding Linear Bandit
Published 11 Feb 2024 in stat.ML and cs.LG | (2402.09467v1)
Abstract: We study a novel pure exploration problem: the $\epsilon$-Thresholding Bandit Problem (TBP) with fixed confidence in stochastic linear bandits. We prove a lower bound on the sample complexity and extend to TBP an algorithm originally designed for Best Arm Identification in linear bandits; the resulting algorithm is asymptotically optimal.
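To make the problem statement concrete, here is a minimal Python sketch of the setting. The abstract does not give formal definitions, so the sketch assumes the usual thresholding criterion: with threshold `tau` and tolerance `eps`, an answer is acceptable if it contains every arm whose mean $x^\top\theta$ is at least `tau + eps` and no arm whose mean is at most `tau - eps`. The function names, the tiny ridge term, and the round-robin sampling rule are illustrative assumptions; this is a naive baseline, not the asymptotically optimal algorithm the paper describes.

```python
import numpy as np


def is_eps_correct(arms, theta, tau, eps, accepted):
    """Check an answer against the assumed eps-tolerant thresholding criterion."""
    means = arms @ theta                      # true mean reward of each arm
    accepted = np.asarray(accepted, dtype=bool)
    must_accept = means >= tau + eps          # clearly above the threshold
    must_reject = means <= tau - eps          # clearly below the threshold
    # Arms strictly inside the (tau - eps, tau + eps) band may go either way.
    return bool(np.all(accepted[must_accept]) and not np.any(accepted[must_reject]))


def uniform_sampling_threshold(arms, theta, tau, n_rounds, noise_std=1.0, seed=0):
    """Naive baseline (assumption, not the paper's method): pull arms
    round-robin, fit theta by least squares, threshold the estimated means."""
    rng = np.random.default_rng(seed)
    n_arms, dim = arms.shape
    design = 1e-6 * np.eye(dim)               # tiny ridge keeps the solve well-posed
    response = np.zeros(dim)
    for t in range(n_rounds):
        x = arms[t % n_arms]
        reward = x @ theta + noise_std * rng.normal()
        design += np.outer(x, x)
        response += reward * x
    theta_hat = np.linalg.solve(design, response)
    return arms @ theta_hat >= tau


if __name__ == "__main__":
    arms = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
    theta = np.array([1.0, 0.0])              # hypothetical unknown parameter
    answer = uniform_sampling_threshold(arms, theta, tau=0.5, n_rounds=300)
    print(answer, is_eps_correct(arms, theta, 0.5, 0.1, answer))
```

A fixed-confidence algorithm would replace the fixed `n_rounds` with a data-dependent stopping rule and a sampling rule adapted to the arm geometry; the sketch only fixes the interface and the correctness check.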