Of Cores: A Partial-Exploration Framework for Markov Decision Processes (1906.06931v6)
Published 17 Jun 2019 in eess.SY, cs.AI, cs.LO, and cs.SY
Abstract: We introduce a framework for approximate analysis of Markov decision processes (MDPs) with bounded-, unbounded-, and infinite-horizon properties. The main idea is to identify a "core" of an MDP, i.e., a subsystem where we provably remain with high probability, and to avoid computation on the less relevant rest of the state space. Although we identify the core using simulations and statistical techniques, it allows for rigorous error bounds in the analysis. Consequently, we obtain efficient analysis algorithms based on partial exploration for various settings, including the challenging case of strongly connected systems.
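As a rough illustration of the "remain with high probability" idea, the sketch below estimates by simulation how often bounded-length runs of a toy MDP stay inside a candidate set of states. Everything here is an assumption made for illustration: the toy model `TOY_MDP`, the function name `estimate_stay_probability`, the uniformly random choice of actions, and the finite horizon. It yields only a Monte Carlo estimate under one policy and is not the paper's core-construction algorithm, which provides rigorous bounds.

```python
import random

# Hypothetical toy MDP: state -> action -> list of (successor, probability).
TOY_MDP = {
    "s0": {"a": [("s0", 0.9), ("s1", 0.1)], "b": [("s1", 1.0)]},
    "s1": {"a": [("s0", 0.99), ("rare", 0.01)]},
    "rare": {"a": [("rare", 1.0)]},
}


def estimate_stay_probability(mdp, initial, candidate, horizon, runs, rng):
    """Fraction of simulated runs that never leave `candidate` within `horizon` steps.

    Actions are picked uniformly at random (an assumption of this sketch),
    so the result says nothing about other strategies.
    """
    stayed = 0
    for _ in range(runs):
        state = initial
        inside = state in candidate
        for _ in range(horizon):
            if not inside:
                break  # the run has left the candidate set
            action = rng.choice(sorted(mdp[state]))
            successors, weights = zip(*mdp[state][action])
            state = rng.choices(successors, weights=weights)[0]
            inside = state in candidate
        stayed += inside
    return stayed / runs


if __name__ == "__main__":
    rng = random.Random(0)
    estimate = estimate_stay_probability(
        TOY_MDP, "s0", {"s0", "s1"}, horizon=20, runs=10_000, rng=rng
    )
    print(f"estimated probability of staying in {{s0, s1}}: {estimate:.3f}")
```

In this toy model the state `rare` is entered only with small probability, so an analysis restricted to `{s0, s1}` would ignore little probability mass over short horizons; the estimate shrinks as the horizon grows.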
- Of Cores: A Partial-Exploration Framework for Markov Decision Processes. In Wan Fokkink and Rob van Glabbeek, editors, 30th International Conference on Concurrency Theory, CONCUR 2019, August 27-30, 2019, Amsterdam, the Netherlands, volume 140 of LIPIcs, pages 5:1–5:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019.