Neural Contextual Bandits for Personalized Recommendation (2312.14037v1)

Published 21 Dec 2023 in cs.IR, cs.AI, and cs.LG

Abstract: In the dynamic landscape of online businesses, recommender systems are pivotal in enhancing user experiences. While traditional approaches have relied on static supervised learning, the quest for adaptive, user-centric recommendations has led to the emergence of contextual bandits. This tutorial investigates contextual bandits as a powerful framework for personalized recommendations. We delve into the challenges, advanced algorithms and theories, collaborative strategies, and open challenges and future prospects within this field. Unlike existing related tutorials, (1) we focus on the exploration perspective of contextual bandits to alleviate the ``Matthew Effect'' in recommender systems, i.e., the rich get richer and the poor get poorer, concerning the popularity of items; (2) in addition to conventional linear contextual bandits, we also dedicate attention to neural contextual bandits, which have emerged as an important branch in recent years, to investigate how neural networks benefit contextual bandits for personalized recommendation both empirically and theoretically; (3) we cover the latest topic, collaborative neural contextual bandits, which incorporate both user heterogeneity and user correlations customized for recommender systems; (4) we discuss newly emerging challenges and open questions for neural contextual bandits with applications in personalized recommendation, especially for large neural models.
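To make the linear contextual bandit setting mentioned in the abstract concrete, below is a minimal LinUCB-style sketch: one ridge-regression estimate per arm, plus an upper-confidence exploration bonus that prevents the policy from only exploiting currently popular arms. The class name, the `alpha` exploration weight, and the toy update loop are illustrative assumptions, not code from the tutorial itself.

```python
import numpy as np

class LinUCB:
    """Minimal LinUCB sketch: a per-arm ridge-regression model with
    a UCB exploration bonus (hyperparameters are illustrative)."""

    def __init__(self, n_arms, dim, alpha=1.0, reg=1.0):
        self.alpha = alpha
        # A accumulates X^T X + reg*I; b accumulates X^T y, per arm.
        self.A = [reg * np.eye(dim) for _ in range(n_arms)]
        self.b = [np.zeros(dim) for _ in range(n_arms)]

    def select(self, context):
        """Pick the arm maximizing estimated reward + exploration bonus."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                # ridge estimate of arm weights
            mean = theta @ context           # exploitation term
            bonus = self.alpha * np.sqrt(context @ A_inv @ context)  # exploration term
            scores.append(mean + bonus)
        return int(np.argmax(scores))

    def update(self, arm, context, reward):
        """Incorporate the observed reward for the chosen arm."""
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context
```

The exploration bonus shrinks for arms whose contexts have been observed often, so rarely shown (less popular) items keep receiving trials; neural contextual bandit methods replace the linear estimate with a neural network while retaining this exploit-plus-explore structure.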
