Incentivized Exploration via Filtered Posterior Sampling (2402.13338v1)
Abstract: We study "incentivized exploration" (IE) in social learning problems where the principal (a recommendation algorithm) can leverage information asymmetry to incentivize sequentially-arriving agents to take exploratory actions. We identify posterior sampling, an algorithmic approach that is well known in the multi-armed bandits literature, as a general-purpose solution for IE. In particular, we expand the existing scope of IE in several practically relevant dimensions, from private agent types to informative recommendations to correlated Bayesian priors. We obtain a general analysis of posterior sampling in IE which allows us to subsume these extended settings as corollaries, while also recovering existing results as special cases.