Equilibrium Computation in Multi-Stage Auctions and Contests (2312.11751v2)
Abstract: We compute equilibrium strategies in multi-stage games with continuous signal and action spaces, which are widely used in the management sciences and economics. Examples include sequential sales via auctions, multi-stage elimination contests, and Stackelberg competitions. In sequential auctions, analysts performing equilibrium analysis must derive not just single bids but bid functions for all possible signals or values that a bidder might have in multiple stages. Due to the continuity of the signal and action spaces, these bid functions come from an infinite-dimensional space. While such models are fundamental to game theory and its applications, equilibrium strategies are rarely known. The resulting system of non-linear differential equations is considered intractable for all but elementary models. This has limited progress in game theory and is a barrier to its adoption in the field. We show that Deep Reinforcement Learning and self-play can learn equilibrium bidding strategies for various multi-stage games. We find equilibria in models that have not yet been explored analytically and new asymmetric equilibrium bid functions for established models of sequential auctions. The verification of equilibrium is challenging in such games due to the continuous signal and action spaces. We introduce a verification algorithm and prove that the error of this verifier decreases when considering Lipschitz continuous strategies with increasing levels of discretization and sample sizes.
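To make the idea of verification by discretization and sampling concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a much simpler setting than the multi-stage games discussed in the abstract: a single-stage first-price sealed-bid auction with risk-neutral bidders whose values are i.i.d. uniform on [0, 1], where the symmetric equilibrium bid function b(v) = (n-1)/n · v is known in closed form. The function name `estimate_utility_loss` and all of its parameters are hypothetical. The sketch estimates how much a single bidder could gain by deviating from a candidate strategy, using a grid of values and bids and a Monte Carlo sample of the rivals' bids.

```python
import bisect
import random


def estimate_utility_loss(strategy, n_bidders=2, n_values=21, n_bids=101,
                          n_samples=20000, seed=0):
    """Estimate the gain from a unilateral deviation from `strategy`.

    Assumed setting (illustrative only): first-price auction, values ~ U[0, 1],
    all rivals bid according to `strategy`.
    """
    rng = random.Random(seed)
    # Monte Carlo sample of the highest rival bid under the candidate strategy.
    top_rival_bids = sorted(
        max(strategy(rng.random()) for _ in range(n_bidders - 1))
        for _ in range(n_samples)
    )

    def expected_utility(value, bid):
        # Fraction of samples in which `bid` strictly beats every rival bid.
        win_prob = bisect.bisect_left(top_rival_bids, bid) / n_samples
        return (value - bid) * win_prob

    # Discretize the bid space [0, 1] into an evenly spaced grid.
    bid_grid = [j / (n_bids - 1) for j in range(n_bids)]
    worst_loss = 0.0
    for k in range(n_values):
        value = k / (n_values - 1)  # grid point in the value space
        best = max(expected_utility(value, b) for b in bid_grid)
        current = expected_utility(value, strategy(value))
        worst_loss = max(worst_loss, best - current)
    return worst_loss


# Known equilibrium for two bidders: bid half of the value.
equilibrium_loss = estimate_utility_loss(lambda v: 0.5 * v)
# Truthful bidding is not an equilibrium in a first-price auction.
truthful_loss = estimate_utility_loss(lambda v: v)
print(f"equilibrium: {equilibrium_loss:.4f}, truthful: {truthful_loss:.4f}")
```

Under these assumptions the estimated loss for the equilibrium strategy should be close to zero, while truthful bidding should show a clearly positive loss. Finer grids and larger samples tighten the estimate, which is the kind of dependence on discretization level and sample size that the abstract refers to.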
Authors: Fabian R. Pieroth, Nils Kohring, Martin Bichler