DROPS

Document

DOI: 10.4230/LIPIcs.CONCUR.2023.35

Deciding What Is Good-For-MDPs

Authors: Sven Schewe, Qiyi Tang, and Tansholpan Zhanabekova

Published in: LIPIcs, Volume 279, 34th International Conference on Concurrency Theory (CONCUR 2023)

Abstract

Nondeterministic good-for-MDPs (GFM) automata are for MDP model checking and reinforcement learning what good-for-games automata are for reactive synthesis: a more compact alternative to deterministic automata that displays nondeterminism, but only so much that it can be resolved locally, such that a syntactic product can be analysed. GFM has recently been introduced as a property for reinforcement learning, where the simpler Büchi acceptance conditions it allows to use is key. However, while there are classic and novel techniques to obtain automata that are GFM, there has not been a decision procedure for checking whether or not an automaton is GFM. We show that GFM-ness is decidable and provide an EXPTIME decision procedure as well as a PSPACE-hardness proof.

Cite as

Sven Schewe, Qiyi Tang, and Tansholpan Zhanabekova. Deciding What Is Good-For-MDPs. In 34th International Conference on Concurrency Theory (CONCUR 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 279, pp. 35:1-35:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{schewe_et_al:LIPIcs.CONCUR.2023.35,
  author =	{Schewe, Sven and Tang, Qiyi and Zhanabekova, Tansholpan},
  title =	{{Deciding What Is Good-For-MDPs}},
  booktitle =	{34th International Conference on Concurrency Theory (CONCUR 2023)},
  pages =	{35:1--35:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-299-0},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{279},
  editor =	{P\'{e}rez, Guillermo A. and Raskin, Jean-Fran\c{c}ois},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CONCUR.2023.35},
  URN =		{urn:nbn:de:0030-drops-190290},
  doi =		{10.4230/LIPIcs.CONCUR.2023.35},
  annote =	{Keywords: B\"{u}chi automata, Markov Decision Processes, Omega-regular objectives, Reinforcement learning}
}

Search Results

Documents authored by Zhanabekova, Tansholpan

Deciding What Is Good-For-MDPs

Abstract

Cite as