
Markov Persuasion Processes: Learning to Persuade from Scratch (2402.03077v2)

Published 5 Feb 2024 in cs.GT and cs.LG

Abstract: In Bayesian persuasion, an informed sender strategically discloses information to a receiver so as to persuade them to undertake desirable actions. Recently, growing attention has been devoted to settings in which the sender and receivers interact sequentially. In particular, Markov persuasion processes (MPPs) have been introduced to capture sequential scenarios where a sender faces a stream of myopic receivers in a Markovian environment. The MPPs studied so far in the literature suffer from issues that prevent them from being fully operational in practice, e.g., they assume that the sender knows receivers' rewards. We fix such issues by addressing MPPs where the sender has no knowledge about the environment. We design a learning algorithm for the sender, working with partial feedback. We prove that its regret with respect to an optimal information-disclosure policy grows sublinearly in the number of episodes, as is the case for the loss in persuasiveness accumulated while learning. Moreover, we provide a lower bound for our setting matching the guarantees of our algorithm.
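
To make the notion of persuasiveness concrete, here is a minimal sketch of one-shot Bayesian persuasion. It is not the learning algorithm from the paper. It uses the well-known prosecutor and judge example of Kamenica and Gentzkow, and every name in it (`posterior`, `is_persuasive`, the state and action labels, the utility table) is a hypothetical choice made for illustration. A signaling scheme maps each state to a distribution over action recommendations, and it is persuasive when following each recommendation is a best response for the receiver under the induced posterior.

```python
from fractions import Fraction

def posterior(prior, scheme, signal):
    """Receiver's Bayesian posterior over states after observing `signal`."""
    joint = {s: prior[s] * scheme[s].get(signal, Fraction(0)) for s in prior}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("signal is never sent under this scheme")
    return {s: p / total for s, p in joint.items()}

def is_persuasive(prior, scheme, receiver_utility, actions):
    """True if each recommended action is a best response under its posterior."""
    signals = {sig for dist in scheme.values() for sig in dist}
    for rec in signals:
        prob = sum(prior[s] * scheme[s].get(rec, Fraction(0)) for s in prior)
        if prob == 0:
            continue  # recommendation never sent, nothing to check
        post = posterior(prior, scheme, rec)
        expected = {
            a: sum(post[s] * receiver_utility[s][a] for s in post)
            for a in actions
        }
        if expected[rec] < max(expected.values()):
            return False
    return True

# Assumed toy instance: the defendant is guilty with prior probability 3/10.
prior = {"guilty": Fraction(3, 10), "innocent": Fraction(7, 10)}
actions = ["convict", "acquit"]
# Assumed receiver (judge) utility: 1 for a correct verdict, 0 otherwise.
receiver_utility = {
    "guilty": {"convict": 1, "acquit": 0},
    "innocent": {"convict": 0, "acquit": 1},
}
# Scheme: always recommend "convict" when guilty, and with
# probability 3/7 when innocent.
scheme = {
    "guilty": {"convict": Fraction(1)},
    "innocent": {"convict": Fraction(3, 7), "acquit": Fraction(4, 7)},
}

post = posterior(prior, scheme, "convict")
conviction_prob = sum(prior[s] * scheme[s].get("convict", Fraction(0)) for s in prior)
print(post["guilty"])      # 1/2
print(conviction_prob)     # 3/5
print(is_persuasive(prior, scheme, receiver_utility, actions))  # True
```

Exact fractions are used so that the tie at posterior 1/2 is not disturbed by floating-point rounding. In the setting described in the abstract, the sender does not know the prior or the receivers' rewards, so a check like this cannot be evaluated exactly while learning, which is why the loss in persuasiveness is tracked alongside regret.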
