Of Cores: A Partial-Exploration Framework for Markov Decision Processes

Křetínský, Jan; Meggendorfer, Tobias

doi:10.4230/LIPIcs.CONCUR.2019.5

File

Author Details

Jan Křetínský

Technical University of Munich, Germany

Tobias Meggendorfer

Technical University of Munich, Germany

Cite AsGet BibTex

Jan Křetínský and Tobias Meggendorfer. Of Cores: A Partial-Exploration Framework for Markov Decision Processes. In 30th International Conference on Concurrency Theory (CONCUR 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 140, pp. 5:1-5:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.CONCUR.2019.5

Abstract

We introduce a framework for approximate analysis of Markov decision processes (MDP) with bounded-, unbounded-, and infinite-horizon properties. The main idea is to identify a "core" of an MDP, i.e., a subsystem where we provably remain with high probability, and to avoid computation on the less relevant rest of the state space. Although we identify the core using simulations and statistical techniques, it allows for rigorous error bounds in the analysis. Consequently, we obtain efficient analysis algorithms based on partial exploration for various settings, including the challenging case of strongly connected systems.

Subject Classification

ACM Subject Classification

Theory of computation → Verification by model checking
Theory of computation → Random walks and Markov chains

Keywords

Markov Decision Processes
Reachability
Approximation

Metrics

Access Statistics
Total Accesses (updated on a weekly basis)

0

PDF Downloads

0

Metadata Views

References

Pranav Ashok, Yuliya Butkova, Holger Hermanns, and Jan Křetínskỳ. Continuous-Time Markov Decisions Based on Partial Exploration. In ATVA, pages 317-334. Springer, 2018.
Pranav Ashok, Krishnendu Chatterjee, Przemyslaw Daca, Jan Kretínský, and Tobias Meggendorfer. Value Iteration for Long-Run Average Reward in Markov Decision Processes. In CAV, pages 201-221, 2017. URL: http://dx.doi.org/10.1007/978-3-319-63387-9_10.
Christel Baier and Joost-Pieter Katoen. Principles of Model Checking. MIT Press, 2008.
Richard Bellman. A Markovian decision process. Journal of Mathematics and Mechanics, pages 679-684, 1957.
Dimitri P. Bertsekas. Dynamic Programming and Optimal Control, Vol. II: Approximate Dynamic Programming. Athena Scientific, 2012.
Tomás Brázdil, Krishnendu Chatterjee, Martin Chmelik, Vojtech Forejt, Jan Křetínský, Marta Z. Kwiatkowska, David Parker, and Mateusz Ujma. Verification of Markov Decision Processes Using Learning Algorithms. In ATVA, pages 98-114. Springer, 2014. URL: http://dx.doi.org/10.1007/978-3-319-11936-6_8.
Costas Courcoubetis and Mihalis Yannakakis. The Complexity of Probabilistic Verification. J. ACM, 42(4):857-907, 1995. URL: http://dx.doi.org/10.1145/210332.210339.
Pedro R. D'Argenio, Bertrand Jeannet, Henrik Ejersbo Jensen, and Kim Guldstrand Larsen. Reduction and Refinement Strategies for Probabilistic Analysis. In PAPM-PROBMIV, pages 57-76. Springer, 2002.
Christian Dehnert, Sebastian Junges, Joost-Pieter Katoen, and Matthias Volk. A storm is coming: A modern probabilistic model checker. In CAV, pages 592-600. Springer, 2017.
Serge Haddad and Benjamin Monmege. Reachability in MDPs: Refining convergence of value iteration. In International Workshop on Reachability Problems, pages 125-137. Springer, 2014.
Ernst Moritz Hahn, Holger Hermanns, Björn Wachter, and Lijun Zhang. PASS: abstraction refinement for infinite probabilistic models. In TACAS, pages 353-357. Springer, 2010.
Jan Křetínský and Tobias Meggendorfer. Of Cores: A Partial-Exploration Framework for Markov Decision Processes. arXiv e-prints, June 2019. URL: http://arxiv.org/abs/1906.06931.
M. Kwiatkowska, G. Norman, and D. Parker. PRISM: Probabilistic symbolic model checker. In TOOLS, pages 200-204, 2002.
Marta Kwiatkowska, Gethin Norman, David Parker, and Jeremy Sproston. Performance analysis of probabilistic timed automata using digital clocks. FMSD, 29(1):33-78, 2006.
Marta Kwiatkowska, Gethin Norman, and Jeremy Sproston. Probabilistic model checking of the IEEE 802.11 wireless local area network protocol. In Process Algebra and Probabilistic Methods: Performance Modeling and Verification, pages 169-187. Springer, 2002.
Marta Z. Kwiatkowska, Gethin Norman, and David Parker. The PRISM Benchmark Suite. In QEST, pages 203-204. IEEE Computer Society, 2012. The models are accessible at URL: http://www.prismmodelchecker.org/casestudies/.
H Brendan McMahan, Maxim Likhachev, and Geoffrey J Gordon. Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees. In ICML, pages 569-576. ACM, 2005.
M.L. Puterman. Markov decision processes: Discrete stochastic dynamic programming. John Wiley and Sons, 1994.
Tim Quatmann and Joost-Pieter Katoen. Sound Value Iteration. In CAV (1), volume 10981 of LNCS, pages 643-661. Springer, 2018.
Aaron Sidford, Mengdi Wang, Xian Wu, and Yinyu Ye. Variance Reduced Value Iteration and Faster Algorithms for Solving Markov Decision Processes. In SODA, pages 770-787. SIAM, 2018.

Of Cores: A Partial-Exploration Framework for Markov Decision Processes

Authors Jan Křetínský , Tobias Meggendorfer

File

Document Identifiers

Author Details

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message

Of Cores: A Partial-Exploration Framework for Markov Decision Processes

Authors Jan Křetínský , Tobias Meggendorfer

File

Document Identifiers

Author Details

Funding

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

Related Versions

References

Thanks for your feedback!

Could not send message