
Leveraging heterogeneous spillover in maximizing contextual bandit rewards (2310.10259v2)

Published 16 Oct 2023 in cs.LG and cs.SI

Abstract: Recommender systems relying on contextual multi-armed bandits continuously improve relevant item recommendations by taking contextual information into account. The objective of bandit algorithms is to learn the best arm (e.g., the best item to recommend) for each user and thus maximize the cumulative rewards from user engagement with the recommendations. The context that these algorithms typically consider consists of the user and item attributes. However, in social networks, where the action of one user can influence the actions and rewards of other users, neighbors' actions are also a very important context: they not only have predictive power but can also impact future rewards through spillover. Moreover, influence susceptibility can vary across people based on their preferences and the closeness of their ties to other users, which leads to heterogeneity in the spillover effects. Here, we present a framework that allows contextual multi-armed bandits to account for such heterogeneous spillovers when choosing the best arm for each user. Our experiments on several semi-synthetic and real-world datasets show that our framework leads to significantly higher rewards than existing state-of-the-art solutions that ignore the network information and potential spillover.
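
To make the idea of treating neighbors' actions as context concrete, here is a minimal sketch. It is not the paper's framework, which the abstract does not specify. It assumes a standard disjoint LinUCB learner and appends one extra feature per arm: the tie-weighted share of a user's neighbors whose most recent action was that arm. All names (`LinUCBArm`, `neighbor_context`, `choose_arm`, `tie_strength`) are hypothetical, and the tie weighting is only one simple way to let spillover differ between users.

```python
import numpy as np


class LinUCBArm:
    """Ridge-regression reward model for a single arm (disjoint LinUCB)."""

    def __init__(self, dim, alpha=1.0):
        self.A = np.eye(dim)      # regularized Gram matrix of seen contexts
        self.b = np.zeros(dim)    # reward-weighted sum of seen contexts
        self.alpha = alpha        # exploration strength

    def ucb(self, x):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        return float(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))

    def update(self, x, reward):
        self.A += np.outer(x, x)
        self.b += reward * x


def neighbor_context(user, arm, last_action, neighbors, tie_strength):
    """Tie-weighted share of the user's neighbors whose last action was `arm`.

    Assumption: missing tie strengths default to 1.0 (unweighted neighbor).
    """
    nbrs = neighbors.get(user, [])
    weights = [tie_strength.get((user, v), 1.0) for v in nbrs]
    total = sum(weights)
    if total == 0:
        return 0.0
    match = sum(w for v, w in zip(nbrs, weights) if last_action.get(v) == arm)
    return match / total


def choose_arm(user, user_features, arms, last_action, neighbors, tie_strength):
    """Pick the arm with the highest UCB score on the augmented context."""
    best_arm, best_score, best_x = None, -np.inf, None
    for arm_id, model in arms.items():
        share = neighbor_context(user, arm_id, last_action, neighbors, tie_strength)
        x = np.append(user_features, share)  # user attributes + neighbor signal
        score = model.ucb(x)
        if score > best_score:
            best_arm, best_score, best_x = arm_id, score, x
    return best_arm, best_x
```

In use, each round would call `choose_arm`, observe the reward, call `arms[arm].update(x, reward)`, and record the chosen arm in `last_action` so that it becomes context for the user's neighbors. A baseline that ignores the network corresponds to dropping the appended feature.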

