Learning in Online Principal-Agent Interactions: The Power of Menus (2312.09869v2)

Published 15 Dec 2023 in cs.GT and cs.LG

Abstract: We study a ubiquitous learning challenge in online principal-agent problems during which the principal learns the agent's private information from the agent's revealed preferences in historical interactions. This paradigm includes important special cases such as pricing and contract design, which have been widely studied in recent literature. However, existing work considers the case where the principal can only choose a single strategy at every round to interact with the agent and then observe the agent's revealed preference through their actions. In this paper, we extend this line of study to allow the principal to offer a menu of strategies to the agent and learn additionally from observing the agent's selection from the menu. We provide a thorough investigation of several online principal-agent problem settings and characterize their sample complexities, accompanied by the corresponding algorithms we have developed. We instantiate this paradigm to several important design problems, including Stackelberg (security) games, contract design, and information design. Finally, we also explore the connection between our findings and existing results about online learning in Stackelberg games, and we offer a solution that can overcome a key hard instance of Peng et al. (2019).

References (44)
  1. Price of Transparency in Strategic Machine Learning. arXiv, arXiv–1610.
  2. Learning prices for repeated auctions with strategic buyers. In Advances in Neural Information Processing Systems, 1169–1177.
  3. Stackelberg security games (ssg) basics and application overview. Improving Homeland Security Decisions, 485.
  4. Learning economic parameters from revealed preferences. In International Conference on Web and Internet Economics, 338–353. Springer.
  5. Learning from revealed preference. In Proceedings of the 7th ACM Conference on Electronic Commerce, 36–42.
  6. Learning optimal commitment to overcome insecurity. In Advances in Neural Information Processing Systems, 1826–1834.
  7. Stackelberg games for adversarial prediction problems. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, 547–555.
  8. Online bayesian persuasion. Advances in Neural Information Processing Systems, 33: 16188–16198.
  9. Designing Menus of Contracts Efficiently: The Power of Randomization. arXiv preprint arXiv:2202.10966.
  10. Prediction, learning, and games. Cambridge university press.
  11. Learning an agent’s utility function by observing behavior. In ICML, 35–42.
  12. Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model. In Proceedings of the 40th International Conference on Machine Learning.
  13. Information elicitation for decision making.
  14. The Limits of Optimal Pricing in the Dark. Advances in Neural Information Processing Systems, 34: 26649–26660.
  15. First-Order Convex Fitting and Its Application to Economics and Optimization. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, 6480–6487.
  16. Algorithmic persuasion with no externalities. In Proceedings of the 2017 ACM Conference on Economics and Computation, 351–368.
  17. Simple versus optimal contracts. In Proceedings of the 2019 ACM Conference on Economics and Computation, 369–387.
  18. Optimal Coordination in Generalized Principal-Agent Problems: A Revisit and Extensions. arXiv preprint arXiv:2209.01146.
  19. Robust Stackelberg Equilibria. In Proceedings of the 24th ACM Conference on Economics and Computation, EC ’23, 735. New York, NY, USA: Association for Computing Machinery. ISBN 9798400701047.
  20. Personalized peer truth serum for eliciting multi-attribute personal data. In Uncertainty in Artificial Intelligence, 18–27. PMLR.
  21. Reverse stackelberg games, part i: Basic framework. In 2012 IEEE International Conference on Control Applications, 421–426. IEEE.
  22. The Power of Menus in Contract Design. arXiv preprint arXiv:2306.12667.
  23. Learning in Stackelberg Games with Non-myopic Agents. In Proceedings of the 23rd ACM Conference on Economics and Computation, 917–918.
  24. Algorithmic Persuasion Through Simulation: Information Design in the Age of Generative AI. arXiv preprint arXiv:2311.18138.
  25. The theory of contracts.
  26. Adaptive contract design for crowdsourcing markets: Bandit algorithms for repeated principal-agent problems. In Proceedings of the fifteenth ACM conference on Economics and computation, 359–376.
  27. Kamenica, E. 2019. Bayesian persuasion and information design. Annual Review of Economics, 11: 249–272.
  28. The value of knowing a demand curve: Bounds on regret for online posted-price auctions. In 44th Annual IEEE Symposium on Foundations of Computer Science, 2003. Proceedings., 594–605. IEEE.
  29. Information elicitation mechanisms for statistical estimation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, 2095–2102.
  30. Learning and approximating the optimal strategy to commit to. In International Symposium on Algorithmic Game Theory, 250–262. Springer.
  31. Optimization of Scoring Rules. In Proceedings of the 23rd ACM Conference on Economics and Computation, EC ’22, 988–989. New York, NY, USA: Association for Computing Machinery. ISBN 9781450391504.
  32. The social cost of strategic classification. In Proceedings of the Conference on Fairness, Accountability, and Transparency, 230–239.
  33. Optimal regret minimization in posted-price auctions with strategic buyers. In Advances in Neural Information Processing Systems, 1871–1879.
  34. Myerson, R. B. 1982. Optimal coordination mechanisms in generalized principal–agent problems. Journal of mathematical economics, 10(1): 67–81.
  35. Learning optimal strategies to commit to. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, 2149–2156.
  36. Savage, L. J. 1971. Elicitation of personal probabilities and expectations. Journal of the American Statistical Association, 66(336): 783–801.
  37. Shalev-Shwartz, S.; et al. 2012. Online learning and online convex optimization. Foundations and Trends® in Machine Learning, 4(2): 107–194.
  38. Stackelberg, H. v. 1934. Marktform und gleichgewicht.
  39. Tambe, M. 2011. Security and game theory: algorithms, deployed systems, lessons learned. Cambridge university press.
  40. Leadership with commitment to mixed strategies. Technical report, Citeseer.
  41. Efficiently learning from revealed preference. In International Workshop on Internet and Network Economics, 114–127. Springer.
  42. Incentive-aware PAC learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, 5797–5804.
  43. Online Learning in Stackelberg Games with an Omniscient Follower. In Krause, A.; Brunskill, E.; Cho, K.; Engelhardt, B.; Sabato, S.; and Scarlett, J., eds., Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, 42304–42316. PMLR.
  44. The Sample Complexity of Online Contract Design. arXiv preprint arXiv:2211.05732.
Authors (3)
  1. Minbiao Han (9 papers)
  2. Michael Albert (30 papers)
  3. Haifeng Xu (95 papers)
Citations (3)
