Deciding What Is Good-For-MDPs

Schewe, Sven; Tang, Qiyi; Zhanabekova, Tansholpan

doi:10.4230/LIPIcs.CONCUR.2023.35

File

LIPIcs.CONCUR.2023.35.pdf

Filesize: 1.21 MB
16 pages

Document Identifiers

DOI: 10.4230/LIPIcs.CONCUR.2023.35
URN: urn:nbn:de:0030-drops-190290

Author Details

Sven Schewe

University of Liverpool, UK

Qiyi Tang

University of Liverpool, UK

Tansholpan Zhanabekova

University of Liverpool, UK

Acknowledgements

We thank the anonymous reviewers of this paper for their constructive feedback. We thank an anonymous reviewer for raising the excellent question (to the original version) of whether or not QGFM and GFM are different. They proved not to be. Without their clever question, we would not have considered this question, and thus not strengthened this paper accordingly.

Cite AsGet BibTex

Sven Schewe, Qiyi Tang, and Tansholpan Zhanabekova. Deciding What Is Good-For-MDPs. In 34th International Conference on Concurrency Theory (CONCUR 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 279, pp. 35:1-35:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.CONCUR.2023.35

Abstract

Nondeterministic good-for-MDPs (GFM) automata are for MDP model checking and reinforcement learning what good-for-games automata are for reactive synthesis: a more compact alternative to deterministic automata that displays nondeterminism, but only so much that it can be resolved locally, such that a syntactic product can be analysed. GFM has recently been introduced as a property for reinforcement learning, where the simpler Büchi acceptance conditions it allows to use is key. However, while there are classic and novel techniques to obtain automata that are GFM, there has not been a decision procedure for checking whether or not an automaton is GFM. We show that GFM-ness is decidable and provide an EXPTIME decision procedure as well as a PSPACE-hardness proof.

Subject Classification

ACM Subject Classification

Theory of computation → Automata over infinite objects
Mathematics of computing → Markov processes

Keywords

Büchi automata
Markov Decision Processes
Omega-regular objectives
Reinforcement learning

Metrics

Access Statistics
Total Accesses (updated on a weekly basis)

0

PDF Downloads

0

Metadata Views

References

Luca Alfaro. Formal Verification of Probabilistic Systems. PhD thesis, Stanford University, Stanford, CA, USA, 1998.
Marc Bagnol and Denis Kuperberg. Büchi good-for-games automata are efficiently recognizable. In FSTTCS, 2018.
Christel Baier and Joost-Pieter Katoen. Principles of Model Checking. MIT Press, 2008.
Andrea Bianco and Luca de Alfaro. Model checking of probabalistic and nondeterministic systems. In P. S. Thiagarajan, editor, Foundations of Software Technology and Theoretical Computer Science, 15th Conference, Bangalore, India, December 18-20, 1995, Proceedings, volume 1026 of Lecture Notes in Computer Science, pages 499-513. Springer, 1995.
Thomas Colcombet. Forms of determinism for automata (invited talk). In Christoph Dürr and Thomas Wilke, editors, 29th International Symposium on Theoretical Aspects of Computer Science, STACS 2012, February 29th - March 3rd, 2012, Paris, France, volume 14 of LIPIcs, pages 1-23. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2012.
Costas Courcoubetis and Mihalis Yannakakis. The complexity of probabilistic verification. J. ACM, 42(4):857-907, July 1995.
Ernst Moritz Hahn, Guangyuan Li, Sven Schewe, Andrea Turrini, and Lijun Zhang. Lazy probabilistic model checking without determinisation. In Concurrency Theory, pages 354-367, 2015.
Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, and Dominik Wojtczak. Omega-regular objectives in model-free reinforcement learning. In Tomáš Vojnar and Lijun Zhang, editors, Tools and Algorithms for the Construction and Analysis of Systems, pages 395-412, Cham, 2019. Springer International Publishing.
Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, and Dominik Wojtczak. Good-for-MDPs automata for probabilistic analysis and reinforcement learning. In Armin Biere and David Parker, editors, Tools and Algorithms for the Construction and Analysis of Systems, pages 306-323, Cham, 2020. Springer International Publishing.
Thomas A. Henzinger and Nir Piterman. Solving games without determinization. In Computer Science Logic, pages 394-409, September 2006. LNCS 4207.
Denis Kuperberg and Michał Skrzypczak. On determinisation of good-for-games automata. In Magnús M. Halldórsson, Kazuo Iwama, Naoki Kobayashi, and Bettina Speckmann, editors, Automata, Languages, and Programming, pages 299-310, Berlin, Heidelberg, 2015. Springer Berlin Heidelberg.
Dominique Perrin and Jean-Éric Pin. Infinite Words: Automata, Semigroups, Logic and Games. Elsevier, 2004.
Nir Piterman. From Nondeterministic Büchi and Streett Automata to Deterministic Parity Automata. Log. Methods Comput. Sci., 3(3), 2007.
Amir Pnueli. The temporal logic of programs. In IEEE Symposium on Foundations of Computer Science, pages 46-57, 1977.
Sven Schewe. Synthesis for probabilistic environments. In 4th International Symposium on Automated Technology for Verification and Analysis (ATVA 2006), pages 245-259. Springer Verlag, 2006.
Sven Schewe, Qiyi Tang, and Tansholpan Zhanabekova. Deciding what is good-for-mdps, 2023. URL: https://arxiv.org/abs/2202.07629.
Salomon Sickert, Javier Esparza, Stefan Jaax, and Jan Křetínský. Limit-deterministic Büchi automata for linear temporal logic. In Computer Aided Verification, pages 312-332, 2016. LNCS 9780.
Larry J. Stockmeyer and Albert R. Meyer. Word problems requiring exponential time: Preliminary report. In Alfred V. Aho, Allan Borodin, Robert L. Constable, Robert W. Floyd, Michael A. Harrison, Richard M. Karp, and H. Raymond Strong, editors, Proceedings of the 5th Annual ACM Symposium on Theory of Computing, April 30 - May 2, 1973, Austin, Texas, USA, pages 1-9. ACM, 1973.
Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, second edition, 2018.
Wolfgang Thomas. Handbook of Theoretical Computer Science, chapter Automata on Infinite Objects, pages 133-191. The MIT Press/ [0]Elsevier, 1990.
Moshe Y. Vardi. Automatic verification of probabilistic concurrent finite state programs. In Foundations of Computer Science, pages 327-338, 1985.