Multiclass Online Learnability under Bandit Feedback (2308.04620v3)

Published 8 Aug 2023 in cs.LG and stat.ML

Abstract: We study online multiclass classification under bandit feedback. We extend the results of Daniely and Helbertal [2013] by showing that the finiteness of the Bandit Littlestone dimension is necessary and sufficient for bandit online learnability even when the label space is unbounded. Moreover, we show that, unlike the full-information setting, sequential uniform convergence is necessary but not sufficient for bandit online learnability. Our result complements the recent work by Hanneke, Moran, Raman, Subedi, and Tewari [2023] who show that the Littlestone dimension characterizes online multiclass learnability in the full-information setting even when the label space is unbounded.
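
To make the abstract's central objects concrete, the sketch below (illustrative only; neither the code nor the toy class comes from the paper) computes the multiclass Littlestone dimension and the Bandit Littlestone dimension of a tiny finite hypothesis class, assuming the standard tree-based recursions: a Littlestone tree branches on two distinct candidate labels at each node and restricts to hypotheses agreeing with the branch label, whereas a bandit Littlestone tree branches on every label y and restricts to hypotheses disagreeing with y, mirroring the "your prediction was wrong" bandit feedback. The domain, label space, and class H are made up for illustration.

```python
from itertools import combinations

# Illustrative sketch (not from the paper): compute the multiclass
# Littlestone dimension and the Bandit Littlestone dimension of a small
# finite class via the standard mistake-tree recursions.
# A hypothesis is a tuple h with h[x] = label predicted on point x.

X = range(3)  # toy domain {0, 1, 2}
Y = range(3)  # toy label space {0, 1, 2}

# Toy class: the three constant functions plus the identity map.
H = frozenset({(0, 0, 0), (1, 1, 1), (2, 2, 2), (0, 1, 2)})

def ldim(hs):
    """Littlestone dimension: each node branches on two distinct labels,
    restricting to the hypotheses that agree with the branch label."""
    if not hs:
        return -1
    best = 0
    for x in X:
        for y1, y2 in combinations(Y, 2):
            h1 = frozenset(h for h in hs if h[x] == y1)
            h2 = frozenset(h for h in hs if h[x] == y2)
            if h1 and h2:
                best = max(best, 1 + min(ldim(h1), ldim(h2)))
    return best

def bldim(hs):
    """Bandit Littlestone dimension: each node branches on every label y,
    restricting to the hypotheses that disagree with y."""
    if not hs:
        return -1
    best = 0
    for x in X:
        realized = {h[x] for h in hs}
        # Labels that no hypothesis predicts at x leave the class unchanged,
        # so those branches add no extra constraint; only realized labels
        # shrink the class and need to be recursed on.
        sub = [bldim(frozenset(h for h in hs if h[x] != y)) for y in realized]
        best = max(best, 1 + min(sub))
    return best

print("Ldim(H)  =", ldim(H))   # 1 for this toy class
print("BLdim(H) =", bldim(H))  # 2 for this toy class
```

On this made-up class the two recursions give 1 and 2 respectively, a small finite-label example of the gap between the two dimensions; the paper's result is that, even when the label space is unbounded, it is the finiteness of the Bandit Littlestone dimension that is necessary and sufficient for bandit online learnability.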

References (19)
  1. Adversarial laws of large numbers and optimal regret in online classification. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 447–455, 2021.
  2. Structural results about on-line learning models with and without queries. Machine Learning, 36:147–181, 1999.
  3. Agnostic online learning. In COLT, volume 3, page 1, 2009.
  4. A characterization of multiclass learnability. In 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 943–955. IEEE, 2022.
  5. Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends® in Machine Learning, 5(1):1–122, 2012.
  6. Amit Daniely and Tal Helbertal. The price of bandit information in multiclass online classification. In Conference on Learning Theory, pages 93–104. PMLR, 2013.
  7. Optimal learners for multiclass problems. In Conference on Learning Theory, pages 287–316. PMLR, 2014.
  8. Multiclass learnability and the ERM principle. In Proceedings of the 24th Annual Conference on Learning Theory, pages 207–232. JMLR Workshop and Conference Proceedings, 2011.
  9. Optimal prediction using expert advice and randomized Littlestone dimension. In COLT, volume 195 of Proceedings of Machine Learning Research, pages 773–836. PMLR, 2023.
  10. Jesse Geneson. A note on the price of bandit feedback for mistake-bounded online learning. Theoretical Computer Science, 874:42–45, 2021.
  11. Steve Hanneke, Shay Moran, Vinod Raman, Unique Subedi, and Ambuj Tewari. Multiclass online learning and uniform convergence. In Proceedings of the 36th Annual Conference on Learning Theory (COLT), 2023.
  12. Efficient bandit algorithms for online multiclass prediction. In Proceedings of the 25th International Conference on Machine Learning, pages 440–447, 2008.
  13. Nick Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2:285–318, 1988.
  14. Philip M. Long. New bounds on the price of bandit feedback for mistake-bounded online multiclass learning. In International Conference on Algorithmic Learning Theory, pages 3–10. PMLR, 2017.
  15. VC classes are adversarially robustly learnable, but only improperly. In Conference on Learning Theory, pages 2512–2530. PMLR, 2019.
  16. B. K. Natarajan. On learning sets and functions. Machine Learning, 4(1):67–97, October 1989. URL https://doi.org/10.1023/A:1022605311895.
  17. Online learning via sequential complexities. Journal of Machine Learning Research, 16(1):155–186, 2015.
  18. Sequential complexities and uniform martingale laws of large numbers. Probability Theory and Related Fields, 161:111–153, 2015.
  19. Theory of pattern recognition, 1974.