Strategy Complexity of Point Payoff, Mean Payoff and Total Payoff Objectives in Countable MDPs (2203.07079v4)

Published 10 Mar 2022 in cs.CC, cs.AI, cs.GT, and math.PR

Abstract: We study countably infinite Markov decision processes (MDPs) with real-valued transition rewards. Every infinite run induces the following sequences of payoffs: 1. Point payoff (the sequence of directly seen transition rewards), 2. Mean payoff (the sequence of the sums of all rewards so far, divided by the number of steps), and 3. Total payoff (the sequence of the sums of all rewards so far). For each payoff type, the objective is to maximize the probability that the $\liminf$ is non-negative. We establish the complete picture of the strategy complexity of these objectives, i.e., how much memory is necessary and sufficient for $\varepsilon$-optimal (resp. optimal) strategies. Some cases can be won with memoryless deterministic strategies, while others require a step counter, a reward counter, or both.
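
For concreteness, the three sequences can be written out as follows. This is a minimal formalization of the abstract's definitions; the symbols $r_i$ (the $i$-th transition reward) and $x_n$ (the $n$-th payoff) are our notation, not necessarily the paper's. Given an infinite run whose transitions carry rewards $r_1, r_2, r_3, \dots$, the $n$-th entry of each payoff sequence is

  Point payoff: $x_n = r_n$
  Mean payoff:  $x_n = \frac{1}{n} \sum_{i=1}^{n} r_i$
  Total payoff: $x_n = \sum_{i=1}^{n} r_i$

and, for each payoff type, the objective is to maximize $\Pr\big(\liminf_{n \to \infty} x_n \ge 0\big)$. Regarding the memory modes mentioned above: a memoryless deterministic strategy chooses its action from the current state alone, while a strategy with a step counter (resp. a reward counter) may additionally consult the number of steps taken (resp. the total reward accumulated) so far.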
