Adaptive Generalized Neyman Allocation: Local Asymptotic Minimax Optimal Best Arm Identification (2405.19317v1)

Published 29 May 2024 in cs.LG, cs.AI, econ.EM, stat.ME, and stat.ML

Abstract: This study investigates a local asymptotic minimax optimal strategy for fixed-budget best arm identification (BAI). We propose the Adaptive Generalized Neyman Allocation (AGNA) strategy and show that its worst-case upper bound of the probability of misidentifying the best arm aligns with the worst-case lower bound under the small-gap regime, where the gap between the expected outcomes of the best and suboptimal arms is small. Our strategy corresponds to a generalization of the Neyman allocation for two-armed bandits (Neyman, 1934; Kaufmann et al., 2016) and a refinement of existing strategies such as the ones proposed by Glynn & Juneja (2004) and Shin et al. (2018). Compared to Komiyama et al. (2022), which proposes a minimax rate-optimal strategy, our proposed strategy has a tighter upper bound that exactly matches the lower bound, including the constant terms, by restricting the class of distributions to the class of small-gap distributions. Our result contributes to the longstanding open issue about the existence of asymptotically optimal strategies in fixed-budget BAI, by presenting the local asymptotic minimax optimal strategy.
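For readers unfamiliar with the Neyman allocation mentioned in the abstract, the following is a minimal sketch of the classical two-armed case only, assuming Gaussian outcomes with known standard deviations. It is not the AGNA strategy from the paper, which is adaptive and covers multiple arms. The function names `neyman_allocation` and `identify_best_arm` are hypothetical and chosen for illustration.

```python
import random
import statistics


def neyman_allocation(sigma_1: float, sigma_2: float, budget: int) -> tuple[int, int]:
    """Split a fixed budget between two arms in proportion to their
    standard deviations (classical Neyman allocation, variances assumed known)."""
    if sigma_1 <= 0 or sigma_2 <= 0:
        raise ValueError("standard deviations must be positive")
    if budget < 2:
        raise ValueError("budget must allow at least one draw per arm")
    n_1 = round(budget * sigma_1 / (sigma_1 + sigma_2))
    # Guarantee that each arm is drawn at least once.
    n_1 = min(max(n_1, 1), budget - 1)
    return n_1, budget - n_1


def identify_best_arm(means, sigmas, budget, seed=0):
    """Simulate Gaussian draws under the allocation above and return the
    index (0 or 1) of the arm with the larger sample mean."""
    rng = random.Random(seed)
    n_1, n_2 = neyman_allocation(sigmas[0], sigmas[1], budget)
    draws_1 = [rng.gauss(means[0], sigmas[0]) for _ in range(n_1)]
    draws_2 = [rng.gauss(means[1], sigmas[1]) for _ in range(n_2)]
    return 0 if statistics.fmean(draws_1) >= statistics.fmean(draws_2) else 1
```

The noisier arm receives more draws, which equalizes the contribution of each arm to the variance of the estimated gap. In practice the standard deviations are unknown and must be estimated during the experiment, which is what adaptive strategies address.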

References (48)
  1. Policy choice and best arm identification: Asymptotic analysis of exploration sampling, 2021. arXiv:2109.08229.
  2. Bayesian fixed-budget best-arm identification, 2023. arXiv:2211.08572.
  3. Best arm identification in multi-armed bandits. In Conference on Learning Theory, pp. 41–53, 2010.
  4. R. R. Bahadur. Stochastic Comparison of Tests. The Annals of Mathematical Statistics, 31(2):276–295, 1960.
  5. Sequential nonparametric testing with the law of the iterated logarithm. In Alexander T. Ihler and Dominik Janzing (eds.), Conference on Uncertainty in Artificial Intelligence, 2016.
  6. On best-arm identification with a fixed budget in non-parametric multi-armed bandits. In International Conference on Algorithmic Learning Theory (ALT), 2023.
  7. Pure exploration in multi-armed bandits problems. In Algorithmic Learning Theory, pp. 23–37. Springer Berlin Heidelberg, 2009.
  8. Pure exploration in finitely-armed and continuous-armed bandits. Theoretical Computer Science, 2011.
  9. Geometric Modeling in Probability and Statistics. Mathematics and Statistics. Springer International Publishing, 2014.
  10. Tight (lower) bounds for the fixed budget best arm identification bandit problem. In COLT, 2016.
  11. Simulation budget allocation for further enhancing the efficiency of ordinal optimization. Discrete Event Dynamic Systems, 10(3):251–270, 2000.
  12. Semiparametric efficient inference in adaptive experiments. In NeurIPS 2023 Workshop on Adaptive Experimental Design and Active Learning in the Real World, 2023. arXiv:2311.18274.
  13. Rémy Degenne. On the existence of a complexity in fixed budget bandit identification. In Conference on Learning Theory, volume 195, pp. 1131–1154. PMLR, 2023.
  14. The method of cumulants for the normal approximation. Probability Surveys, 19:185–270, 2022.
  15. John Duchi. Lecture notes on statistics and information theory, 2023. URL https://web.stanford.edu/class/stats311/lecture-notes.pdf.
  16. Richard S. Ellis. Large Deviations for a General Class of Random Vectors. The Annals of Probability, 12(1):1–12, 1984.
  17. Optimal best arm identification with fixed confidence. In Conference on Learning Theory, 2016.
  18. Jürgen Gärtner. On large deviations from the invariant measure. Theory of Probability & Its Applications, 22(1):24–39, 1977.
  19. A large deviations perspective on ordinal optimization. In Proceedings of the 2004 Winter Simulation Conference, volume 1. IEEE, 2004.
  20. Adaptive experimental design using the propensity score. Journal of Business and Economic Statistics, 2011.
  21. Efficient estimation of average treatment effects using the estimated propensity score. Econometrica, 2003.
  22. Time-uniform, nonparametric, nonasymptotic confidence sequences. Annals of Statistics, 2021.
  23. Almost optimal exploration in multi-armed bandits. In International Conference on Machine Learning, 2013.
  24. Adaptive treatment assignment in experiments for policy choice. Econometrica, 89(1):113–132, 2021.
  25. Masahiro Kato. Locally optimal fixed-budget best arm identification in two-armed gaussian bandits with unknown variances, 2024a. arXiv:2312.12741.
  26. Masahiro Kato. Worst-case optimal multi-armed gaussian best arm identification with a fixed budget, 2024b. arXiv:2310.19788.
  27. Efficient adaptive experimental design for average treatment effect estimation, 2020. arXiv:2002.05308.
  28. Emilie Kaufmann. Contributions to the Optimal Solution of Several Bandits Problems. Habilitation à Diriger des Recherches, Université de Lille, 2020. URL https://emiliekaufmann.github.io/HDR_EmilieKaufmann.pdf.
  29. On the complexity of best-arm identification in multi-armed bandit models. Journal of Machine Learning Research, 17(1):1–42, 2016.
  30. Minimax optimal algorithms for fixed-budget best arm identification. In Advances in Neural Information Processing Systems, 2022.
  31. Rate-optimal bayesian simple regret in best arm identification. Mathematics of Operations Research, 2023.
  32. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 1985.
  33. Fixed-budget best-arm identification with heterogeneous reward variances. In Conference on Uncertainty in Artificial Intelligence, 2023.
  34. L. Le Cam. Limits of experiments. In Theory of Statistics, pp. 245–282. University of California Press, 1972.
  35. Lucien Le Cam. Asymptotic Methods in Statistical Decision Theory (Springer Series in Statistics). Springer, 1986.
  36. Theory of Point Estimation. Springer-Verlag, 1998.
  37. Jerzy Neyman. Sur les applications de la theorie des probabilites aux experiences agricoles: Essai des principes. Statistical Science, 5:463–472, 1923.
  38. Jerzy Neyman. On the two different aspects of the representative method: the method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society, 97:123–150, 1934.
  39. Prior-dependent allocations for bayesian fixed-budget best-arm identification in structured bandits, 2024. arXiv:2402.05878.
  40. Chao Qin. Open problem: Optimal best arm identification with fixed-budget. In Conference on Learning Theory, 2022.
  41. Donald B. Rubin. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 1974.
  42. Tractable sampling strategies for ordinal optimization. Operations Research, 66(6):1693–1712, 2018.
  43. Mark J. van der Laan. The construction and analysis of adaptive group sequential designs, 2008. URL https://biostats.bepress.com/ucbbiostat/paper232.
  44. A.W. van der Vaart. An asymptotic representation theorem. International Statistical Review / Revue Internationale de Statistique, 59(1):97–121, 1991.
  45. A.W. van der Vaart. Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 1998.
  46. A. Wald. Sequential tests of statistical hypotheses. The Annals of Mathematical Statistics, 16(2):117–186, 1945.
  47. On uniformly optimal algorithms for best arm identification in two-armed bandits with fixed budget. In International Conference on Machine Learning (ICML), 2024.
  48. Minimax optimal fixed-budget best arm identification in linear bandits. In Advances in Neural Information Processing Systems, 2022.
Authors (1)
  1. Masahiro Kato (50 papers)
Citations (1)
