How to Play in Infinite MDPs (Invited Talk)

Authors: Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Patrick Totzke, Dominik Wojtczak




File

LIPIcs.ICALP.2020.3.pdf
  • Filesize: 0.54 MB
  • 18 pages

Document Identifiers
  • DOI: 10.4230/LIPIcs.ICALP.2020.3

Author Details

Stefan Kiefer
  • Department of Computer Science, University of Oxford, United Kingdom
Richard Mayr
  • School of Informatics, University of Edinburgh, United Kingdom
Mahsa Shirmohammadi
  • CNRS & IRIF, Université de Paris, France
Patrick Totzke
  • Department of Computer Science, University of Liverpool, United Kingdom
Dominik Wojtczak
  • Department of Computer Science, University of Liverpool, United Kingdom

Cite As

Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Patrick Totzke, and Dominik Wojtczak. How to Play in Infinite MDPs (Invited Talk). In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 3:1-3:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020). https://doi.org/10.4230/LIPIcs.ICALP.2020.3

Abstract

Markov decision processes (MDPs) are a standard model for dynamic systems that exhibit both stochastic and nondeterministic behavior. For MDPs with finite state space it is known that for a wide range of objectives there exist optimal strategies that are memoryless and deterministic. In contrast, if the state space is infinite, optimal strategies may not exist, and optimal or ε-optimal strategies may require (possibly infinite) memory. In this paper we consider qualitative objectives: reachability, safety, (co-)Büchi, and other parity objectives. We aim to give an introduction to a collection of techniques that allow for the construction of strategies with little or no memory in countably infinite MDPs.
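
The contrast drawn in the abstract (memoryless deterministic optimal strategies in finite MDPs versus their possible non-existence in infinite ones) can be made concrete with a small sketch. The following Python snippet is not from the paper; the MDP, its state names, and the helpers reachability_values and memoryless_strategy are invented for illustration. It runs value iteration for a reachability objective on a finite MDP and then extracts a memoryless deterministic strategy by fixing one maximizing action per state.

    # Illustrative sketch (not from the paper): value iteration for a
    # reachability objective in a small finite MDP. The example MDP and
    # all names below are made up.

    # MDP: state -> action -> list of (successor, probability) pairs.
    mdp = {
        "s0": {"a": [("s1", 0.5), ("s2", 0.5)], "b": [("s2", 1.0)]},
        "s1": {"a": [("goal", 0.9), ("s0", 0.1)]},
        "s2": {"a": [("s0", 1.0)]},
        "goal": {},  # target state, absorbing
    }

    def reachability_values(mdp, target, iterations=1000):
        """Iterate V(s) <- max_a sum_t P(s,a,t) * V(t), with V(target) = 1."""
        values = {s: 0.0 for s in mdp}
        values[target] = 1.0
        for _ in range(iterations):
            for s, actions in mdp.items():
                if s == target or not actions:
                    continue
                values[s] = max(
                    sum(p * values[t] for t, p in succ)
                    for succ in actions.values()
                )
        return values

    def memoryless_strategy(mdp, target, values):
        """In each state, pick one action maximizing the expected value:
        this yields a memoryless deterministic strategy."""
        strategy = {}
        for s, actions in mdp.items():
            if s == target or not actions:
                continue
            strategy[s] = max(
                actions,
                key=lambda a: sum(p * values[t] for t, p in actions[a]),
            )
        return strategy

    values = reachability_values(mdp, "goal")
    print(memoryless_strategy(mdp, "goal", values))
    # -> {'s0': 'a', 's1': 'a', 's2': 'a'}

In a countably infinite MDP this greedy extraction can fail: with infinitely many states or actions the supremum need not be attained by any single choice, so only ε-optimal strategies may exist and these may need memory. Constructing strategies with little or no memory in that setting is exactly what the paper surveys.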

Subject Classification

ACM Subject Classification
  • Theory of computation → Random walks and Markov chains
  • Mathematics of computing → Probability and statistics
Keywords
  • Markov decision processes

