Büchi Objectives in Countable MDPs (Track B: Automata, Logic, Semantics, and Theory of Programming)

Kiefer, Stefan; Mayr, Richard; Shirmohammadi, Mahsa; Totzke, Patrick

doi:10.4230/LIPIcs.ICALP.2019.119

Abstract

We study countably infinite Markov decision processes with Büchi objectives, which ask to visit a given subset F of states infinitely often. A question left open by T.P. Hill in 1979 [Theodore Preston Hill, 1979] is whether there always exist ε-optimal Markov strategies, i.e., strategies that base decisions only on the current state and the number of steps taken so far. We provide a negative answer to this question by constructing a non-trivial counterexample. On the other hand, we show that Markov strategies with only 1 bit of extra memory are sufficient.
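The notions named in the abstract can be written out as follows. This is only a sketch in commonly used notation, not necessarily the paper's own: the symbols S (countable state set), A (action set), D(X) (probability distributions over X), and P_{s,σ} (the probability measure on runs induced by starting in state s and following strategy σ) are assumptions introduced here for illustration, and the block assumes the amsmath and amssymb packages.

```latex
% Sketch under assumed notation; S, A, \mathcal{D}, and \mathcal{P} are not taken from the source.
\begin{align*}
  % Buchi objective: runs that visit F infinitely often
  \text{B\"uchi}(F) &= \{\, s_0 s_1 s_2 \cdots \in S^{\omega} \mid
      \forall i \in \mathbb{N}\ \exists j \ge i : s_j \in F \,\} \\
  % Markov strategy: depends only on the current state and the step count
  \sigma &: S \times \mathbb{N} \to \mathcal{D}(A) \\
  % Markov strategy with 1 bit of extra memory: also reads and updates one bit
  \sigma' &: S \times \mathbb{N} \times \{0,1\} \to \mathcal{D}(A \times \{0,1\}) \\
  % epsilon-optimality of \sigma from state s, where \tau ranges over all strategies
  \mathcal{P}_{s,\sigma}\bigl(\text{B\"uchi}(F)\bigr) &\ge
      \sup_{\tau} \mathcal{P}_{s,\tau}\bigl(\text{B\"uchi}(F)\bigr) - \varepsilon
\end{align*}
```

Read this way, the abstract's negative result says that strategies of the second form are in general not enough to satisfy the last inequality for every ε > 0, while strategies of the third form are.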

Cite As

Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, and Patrick Totzke. Büchi Objectives in Countable MDPs (Track B: Automata, Logic, Semantics, and Theory of Programming). In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 132, pp. 119:1-119:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019) https://doi.org/10.4230/LIPIcs.ICALP.2019.119

Author Details

Stefan Kiefer

University of Oxford, UK

Richard Mayr

University of Edinburgh, UK

Mahsa Shirmohammadi

CNRS, Paris, France
IRIF, Paris, France

Patrick Totzke

University of Liverpool, UK

Funding

Kiefer, Stefan: Supported by a Royal Society University Research Fellowship.
Mayr, Richard: Supported by EPSRC grant EP/M027651/1.
Shirmohammadi, Mahsa: Supported by PEPS JCJC grant AAPS.

Acknowledgements

The authors thank anonymous reviewers for their helpful comments.

References

Pieter Abbeel and Andrew Y. Ng. Learning first-order Markov models for control. In Advances in Neural Information Processing Systems 17, pages 1-8. MIT Press, 2004.
Christel Baier and Joost-Pieter Katoen. Principles of Model Checking. MIT Press, 2008.
P. Billingsley. Probability and Measure. Wiley, 1995. Third Edition.
Vincent D. Blondel and John N. Tsitsiklis. A survey of computational complexity results in systems and control. Automatica, 36(9):1249-1274, 2000.
Nicole Bäuerle and Ulrich Rieder. Markov Decision Processes with Applications to Finance. Springer-Verlag Berlin Heidelberg, 2011.
K. Chatterjee, L. de Alfaro, and T. Henzinger. Trading memory for randomness. In Annual Conference on Quantitative Evaluation of Systems, pages 206-217. IEEE Computer Society Press, 2004.
K. Chatterjee and T. Henzinger. A survey of stochastic ω-regular games. Journal of Computer and System Sciences, 78(2):394-413, 2012.
K. Chatterjee, M. Jurdziński, and T. Henzinger. Quantitative Stochastic Parity Games. In Annual ACM-SIAM Symposium on Discrete Algorithms, pages 121-130. Society for Industrial and Applied Mathematics, 2004.
Edmund M. Clarke, Thomas A. Henzinger, Helmut Veith, and Roderick Bloem, editors. Handbook of Model Checking. Springer, 2018.
Theodore Preston Hill. On the Existence of Good Markov strategies. Transactions of the American Mathematical Society, 247:157-176, 1979.
Theodore Preston Hill. Goal Problems in Gambling Theory. Revista de Matemática: Teoría y Aplicaciones, 6(2):125-132, 1999.
Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, and Patrick Totzke. Büchi Objectives in Countable MDPs. Technical report, arxiv.org, 2019. URL: http://arxiv.org/abs/1904.11573.
Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, and Dominik Wojtczak. Parity Objectives in Countable MDPs. In Annual IEEE Symposium on Logic in Computer Science, 2017.
J. Krčál. Determinacy and Optimal Strategies in Stochastic Games. Master’s thesis, Masaryk University, School of Informatics, 2009.
D. Ornstein. On the existence of stationary optimal strategies. Proceedings of the American Mathematical Society, 20:563-569, 1969.
Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., 1st edition, 1994.
Manfred Schäl. Markov decision processes in finance and dynamic options. In Handbook of Markov Decision Processes, pages 461-487. Springer, 2002.
Olivier Sigaud and Olivier Buffet. Markov Decision Processes in Artificial Intelligence. John Wiley & Sons, 2013.
R.S. Sutton and A.G. Barto. Reinforcement Learning: An Introduction. Adaptive Computation and Machine Learning. MIT Press, 2018.
