Deciding What Is Good-For-MDPs

Authors Sven Schewe , Qiyi Tang , Tansholpan Zhanabekova



PDF
Thumbnail PDF

File

LIPIcs.CONCUR.2023.35.pdf
  • Filesize: 1.21 MB
  • 16 pages

Document Identifiers

Author Details

Sven Schewe
  • University of Liverpool, UK
Qiyi Tang
  • University of Liverpool, UK
Tansholpan Zhanabekova
  • University of Liverpool, UK

Acknowledgements

We thank the anonymous reviewers of this paper for their constructive feedback. We thank an anonymous reviewer for raising the excellent question (to the original version) of whether or not QGFM and GFM are different. They proved not to be. Without their clever question, we would not have considered this question, and thus not strengthened this paper accordingly.

Cite AsGet BibTex

Sven Schewe, Qiyi Tang, and Tansholpan Zhanabekova. Deciding What Is Good-For-MDPs. In 34th International Conference on Concurrency Theory (CONCUR 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 279, pp. 35:1-35:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.CONCUR.2023.35

Abstract

Nondeterministic good-for-MDPs (GFM) automata are for MDP model checking and reinforcement learning what good-for-games automata are for reactive synthesis: a more compact alternative to deterministic automata that displays nondeterminism, but only so much that it can be resolved locally, such that a syntactic product can be analysed. GFM has recently been introduced as a property for reinforcement learning, where the simpler Büchi acceptance conditions it allows to use is key. However, while there are classic and novel techniques to obtain automata that are GFM, there has not been a decision procedure for checking whether or not an automaton is GFM. We show that GFM-ness is decidable and provide an EXPTIME decision procedure as well as a PSPACE-hardness proof.

Subject Classification

ACM Subject Classification
  • Theory of computation → Automata over infinite objects
  • Mathematics of computing → Markov processes
Keywords
  • Büchi automata
  • Markov Decision Processes
  • Omega-regular objectives
  • Reinforcement learning

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Luca Alfaro. Formal Verification of Probabilistic Systems. PhD thesis, Stanford University, Stanford, CA, USA, 1998. Google Scholar
  2. Marc Bagnol and Denis Kuperberg. Büchi good-for-games automata are efficiently recognizable. In FSTTCS, 2018. Google Scholar
  3. Christel Baier and Joost-Pieter Katoen. Principles of Model Checking. MIT Press, 2008. Google Scholar
  4. Andrea Bianco and Luca de Alfaro. Model checking of probabalistic and nondeterministic systems. In P. S. Thiagarajan, editor, Foundations of Software Technology and Theoretical Computer Science, 15th Conference, Bangalore, India, December 18-20, 1995, Proceedings, volume 1026 of Lecture Notes in Computer Science, pages 499-513. Springer, 1995. Google Scholar
  5. Thomas Colcombet. Forms of determinism for automata (invited talk). In Christoph Dürr and Thomas Wilke, editors, 29th International Symposium on Theoretical Aspects of Computer Science, STACS 2012, February 29th - March 3rd, 2012, Paris, France, volume 14 of LIPIcs, pages 1-23. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2012. Google Scholar
  6. Costas Courcoubetis and Mihalis Yannakakis. The complexity of probabilistic verification. J. ACM, 42(4):857-907, July 1995. Google Scholar
  7. Ernst Moritz Hahn, Guangyuan Li, Sven Schewe, Andrea Turrini, and Lijun Zhang. Lazy probabilistic model checking without determinisation. In Concurrency Theory, pages 354-367, 2015. Google Scholar
  8. Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, and Dominik Wojtczak. Omega-regular objectives in model-free reinforcement learning. In Tomáš Vojnar and Lijun Zhang, editors, Tools and Algorithms for the Construction and Analysis of Systems, pages 395-412, Cham, 2019. Springer International Publishing. Google Scholar
  9. Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, Ashutosh Trivedi, and Dominik Wojtczak. Good-for-MDPs automata for probabilistic analysis and reinforcement learning. In Armin Biere and David Parker, editors, Tools and Algorithms for the Construction and Analysis of Systems, pages 306-323, Cham, 2020. Springer International Publishing. Google Scholar
  10. Thomas A. Henzinger and Nir Piterman. Solving games without determinization. In Computer Science Logic, pages 394-409, September 2006. LNCS 4207. Google Scholar
  11. Denis Kuperberg and Michał Skrzypczak. On determinisation of good-for-games automata. In Magnús M. Halldórsson, Kazuo Iwama, Naoki Kobayashi, and Bettina Speckmann, editors, Automata, Languages, and Programming, pages 299-310, Berlin, Heidelberg, 2015. Springer Berlin Heidelberg. Google Scholar
  12. Dominique Perrin and Jean-Éric Pin. Infinite Words: Automata, Semigroups, Logic and Games. Elsevier, 2004. Google Scholar
  13. Nir Piterman. From Nondeterministic Büchi and Streett Automata to Deterministic Parity Automata. Log. Methods Comput. Sci., 3(3), 2007. Google Scholar
  14. Amir Pnueli. The temporal logic of programs. In IEEE Symposium on Foundations of Computer Science, pages 46-57, 1977. Google Scholar
  15. Sven Schewe. Synthesis for probabilistic environments. In 4th International Symposium on Automated Technology for Verification and Analysis (ATVA 2006), pages 245-259. Springer Verlag, 2006. Google Scholar
  16. Sven Schewe, Qiyi Tang, and Tansholpan Zhanabekova. Deciding what is good-for-mdps, 2023. URL: https://arxiv.org/abs/2202.07629.
  17. Salomon Sickert, Javier Esparza, Stefan Jaax, and Jan Křetínský. Limit-deterministic Büchi automata for linear temporal logic. In Computer Aided Verification, pages 312-332, 2016. LNCS 9780. Google Scholar
  18. Larry J. Stockmeyer and Albert R. Meyer. Word problems requiring exponential time: Preliminary report. In Alfred V. Aho, Allan Borodin, Robert L. Constable, Robert W. Floyd, Michael A. Harrison, Richard M. Karp, and H. Raymond Strong, editors, Proceedings of the 5th Annual ACM Symposium on Theory of Computing, April 30 - May 2, 1973, Austin, Texas, USA, pages 1-9. ACM, 1973. Google Scholar
  19. Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, second edition, 2018. Google Scholar
  20. Wolfgang Thomas. Handbook of Theoretical Computer Science, chapter Automata on Infinite Objects, pages 133-191. The MIT Press/ [0]Elsevier, 1990. Google Scholar
  21. Moshe Y. Vardi. Automatic verification of probabilistic concurrent finite state programs. In Foundations of Computer Science, pages 327-338, 1985. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail