Of Cores: A Partial-Exploration Framework for Markov Decision Processes

Authors: Jan Křetínský, Tobias Meggendorfer



File: LIPIcs.CONCUR.2019.5.pdf (0.55 MB, 17 pages)

Author Details

Jan Křetínský
  • Technical University of Munich, Germany
Tobias Meggendorfer
  • Technical University of Munich, Germany

Cite As

Jan Křetínský and Tobias Meggendorfer. Of Cores: A Partial-Exploration Framework for Markov Decision Processes. In 30th International Conference on Concurrency Theory (CONCUR 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 140, pp. 5:1-5:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019) https://doi.org/10.4230/LIPIcs.CONCUR.2019.5

Abstract

We introduce a framework for approximate analysis of Markov decision processes (MDP) with bounded-, unbounded-, and infinite-horizon properties. The main idea is to identify a "core" of an MDP, i.e., a subsystem where we provably remain with high probability, and to avoid computation on the less relevant rest of the state space. Although we identify the core using simulations and statistical techniques, it allows for rigorous error bounds in the analysis. Consequently, we obtain efficient analysis algorithms based on partial exploration for various settings, including the challenging case of strongly connected systems.
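The "core" idea from the abstract can be illustrated, in the simpler setting of a Markov chain without nondeterminism, as a simulation that keeps the most frequently visited states until they cover a desired probability mass. The sketch below is only an illustration of that intuition: the paper's actual algorithms handle full MDPs and derive rigorous error bounds, and all function and variable names here are our own, not the paper's API.

```python
import random

def sample_core(successors, init, runs=1000, horizon=50, threshold=0.99):
    """Collect visit counts over `runs` sampled paths of length `horizon`,
    then greedily pick the most-visited states until their empirical
    probability mass reaches `threshold`. Returns that candidate core.
    `successors` maps each state to a list of (next_state, probability) pairs."""
    visits = {}
    for _ in range(runs):
        state = init
        for _ in range(horizon):
            visits[state] = visits.get(state, 0) + 1
            next_states, probs = zip(*successors[state])
            state = random.choices(next_states, weights=probs)[0]
    total = sum(visits.values())
    core, mass = set(), 0
    # Add states in order of decreasing empirical frequency.
    for s, count in sorted(visits.items(), key=lambda kv: -kv[1]):
        core.add(s)
        mass += count
        if mass / total >= threshold:
            break
    return core

# Tiny example chain: state "a" is left only rarely, so a core covering
# 90% of the probability mass typically consists of "a" alone.
chain = {"a": [("a", 0.99), ("b", 0.01)], "b": [("a", 1.0)]}
```

Analysis restricted to such a core is then cheaper than exploring the full state space, while the mass left outside the core bounds the error, which is the trade-off the framework makes precise.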

ACM Subject Classification
  • Theory of computation → Verification by model checking
  • Theory of computation → Random walks and Markov chains
Keywords
  • Markov Decision Processes
  • Reachability
  • Approximation

