High-dimensional Contextual Bandit Problem without Sparsity (2306.11017v2)

Published 19 Jun 2023 in stat.ML and cs.LG

Abstract: We investigate the high-dimensional linear contextual bandit problem in which the number of features $p$ exceeds the budget $T$, or may even be infinite. Unlike most previous work in this field, we do not impose sparsity on the regression coefficients. Instead, we rely on recent findings on overparameterized models, which enable us to analyze the performance of the minimum-norm interpolating estimator when the data distributions have small effective ranks. We propose an explore-then-commit (EtC) algorithm for this problem and examine its performance. Through our analysis, we derive the optimal rate of the EtC algorithm in terms of $T$ and show that this rate can be achieved by balancing exploration and exploitation. Moreover, we introduce an adaptive explore-then-commit (AEtC) algorithm that adaptively finds the optimal balance. We assess the performance of the proposed algorithms through a series of simulations.
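The abstract describes an explore-then-commit template: pull arms at random for an exploration phase, fit the minimum-norm interpolating (pseudoinverse) estimator without any sparsity penalty, then commit to the greedy policy. The sketch below illustrates that template only; the function interface, names, and the fixed exploration length are illustrative assumptions, and the paper's actual contributions (effective-rank analysis and the adaptive AEtC balancing) are not reproduced here.

```python
import numpy as np

def etc_linear_bandit(contexts, reward_fn, T, n_explore, seed=None):
    """Minimal explore-then-commit sketch for a linear contextual bandit.

    contexts:  t -> array of shape (K, p), the K arm feature vectors at round t
    reward_fn: (t, arm) -> observed reward for pulling `arm` at round t
    """
    rng = np.random.default_rng(seed)
    X, y = [], []
    # Exploration phase: pull arms uniformly at random.
    for t in range(n_explore):
        arms = contexts(t)
        a = int(rng.integers(arms.shape[0]))
        X.append(arms[a])
        y.append(reward_fn(t, a))
    # Minimum-norm interpolating estimator: theta = X^+ y via the
    # Moore-Penrose pseudoinverse; no sparsity assumption, valid even
    # when p > n_explore (the overparameterized regime).
    theta = np.linalg.pinv(np.asarray(X)) @ np.asarray(y)
    # Commit phase: play the greedy arm under the fitted theta.
    total = 0.0
    for t in range(n_explore, T):
        arms = contexts(t)
        a = int(np.argmax(arms @ theta))
        total += reward_fn(t, a)
    return theta, total
```

In the paper, the exploration length is not a free constant: the EtC analysis chooses it to balance exploration and exploitation at the optimal rate in $T$, and the AEtC variant finds that balance adaptively rather than taking `n_explore` as given.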
