How to Play in Infinite MDPs (Invited Talk)

Authors: Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Patrick Totzke, Dominik Wojtczak




File

LIPIcs.ICALP.2020.3.pdf
  • Filesize: 0.54 MB
  • 18 pages

Document Identifiers
  • DOI: 10.4230/LIPIcs.ICALP.2020.3

Author Details

Stefan Kiefer
  • Department of Computer Science, University of Oxford, United Kingdom
Richard Mayr
  • School of Informatics, University of Edinburgh, United Kingdom
Mahsa Shirmohammadi
  • CNRS & IRIF, Université de Paris, France
Patrick Totzke
  • Department of Computer Science, University of Liverpool, United Kingdom
Dominik Wojtczak
  • Department of Computer Science, University of Liverpool, United Kingdom

Cite As

Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Patrick Totzke, and Dominik Wojtczak. How to Play in Infinite MDPs (Invited Talk). In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 3:1-3:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.ICALP.2020.3

Abstract

Markov decision processes (MDPs) are a standard model for dynamic systems that exhibit both stochastic and nondeterministic behavior. For MDPs with finite state space it is known that, for a wide range of objectives, there exist optimal strategies that are memoryless and deterministic. In contrast, if the state space is infinite, optimal strategies may not exist, and optimal or ε-optimal strategies may require (possibly infinite) memory. In this paper we consider qualitative objectives: reachability, safety, (co-)Büchi, and other parity objectives. We aim to give an introduction to a collection of techniques that allow for the construction of strategies with little or no memory in countably infinite MDPs.
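
The phenomenon the abstract highlights, that optimal strategies may fail to exist in infinite MDPs while ε-optimal ones do, can be seen in a small standard gadget. The Python sketch below is our own illustration, not code from the paper; the MDP and all names (step, run, stop_at) are made up for this example. In state n the controller can advance to state n+1 or gamble, reaching the goal with probability 1 - 2^(-n) and a losing sink otherwise. Stopping at level N wins with probability 1 - 2^(-N), and never stopping wins with probability 0, so the reachability value is 1 but no strategy attains it.

# Illustrative sketch (not from the paper): a countably infinite MDP whose
# reachability value is 1, yet no optimal strategy exists.
import random

GOAL, SINK = "goal", "sink"

def step(state, action, rng):
    """One transition of the illustrative MDP from integer state >= 1."""
    if action == "advance":
        return state + 1  # move deterministically to the next level
    # action == "gamble": win with probability 1 - 2^(-state)
    return GOAL if rng.random() < 1 - 2.0 ** (-state) else SINK

def run(strategy, rng, start=1):
    """Play until GOAL or SINK; strategy maps the current state to an action."""
    state = start
    while state not in (GOAL, SINK):
        state = step(state, strategy(state), rng)
    return state == GOAL

def stop_at(N):
    """Memoryless strategy: advance until level N, then gamble."""
    return lambda state: "gamble" if state >= N else "advance"

if __name__ == "__main__":
    rng = random.Random(0)
    trials = 100_000
    for N in (1, 3, 6, 10):
        wins = sum(run(stop_at(N), rng) for _ in range(trials))
        # The estimate approaches 1 as N grows, but no N attains 1.
        print(f"N={N:2d}  est. P(reach goal) = {wins / trials:.4f}"
              f"  (theory: {1 - 2.0 ** (-N):.4f})")

Note that each stop_at(N) is memoryless (its choice depends only on the current state) and is ε-optimal for ε = 2^(-N), in line with the paper's theme that in countably infinite MDPs one often has to settle for ε-optimal strategies, but these can frequently be chosen with little or no memory.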

Subject Classification

ACM Subject Classification
  • Theory of computation → Random walks and Markov chains
  • Mathematics of computing → Probability and statistics
Keywords
  • Markov decision processes

