Transience in Countable MDPs

Authors: Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Patrick Totzke



File

  • LIPIcs.CONCUR.2021.11.pdf (0.72 MB, 15 pages)

Document Identifiers

  • DOI: 10.4230/LIPIcs.CONCUR.2021.11

Author Details

Stefan Kiefer
  • Department of Computer Science, University of Oxford, UK
Richard Mayr
  • School of Informatics, University of Edinburgh, UK
Mahsa Shirmohammadi
  • Université de Paris, CNRS, IRIF, F-75013 Paris, France
Patrick Totzke
  • Department of Computer Science, University of Liverpool, UK

Cite As

Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, and Patrick Totzke. Transience in Countable MDPs. In 32nd International Conference on Concurrency Theory (CONCUR 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 203, pp. 11:1-11:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.CONCUR.2021.11

Abstract

The Transience objective is not to visit any state infinitely often. While this is not possible in any finite Markov Decision Process (MDP), it can be satisfied in countably infinite ones, e.g., if the transition graph is acyclic. We prove the following fundamental properties of Transience in countably infinite MDPs.
1) There exist uniformly ε-optimal MD (memoryless deterministic) strategies for Transience, even in infinitely branching MDPs.
2) Optimal strategies for Transience need not exist, even if the MDP is finitely branching. However, if an optimal strategy exists, then there is also an optimal MD strategy.
3) If an MDP is universally transient (i.e., almost surely transient under all strategies), then many other objectives have a lower strategy complexity than in general MDPs. For example, ε-optimal strategies for Safety and co-Büchi, and optimal strategies for {0,1,2}-Parity (where they exist), can be chosen MD, even if the MDP is infinitely branching.
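A concrete and standard illustration of transience in a countable system (not taken from the paper): the random walk on the nonnegative integers that moves up with probability p > 1/2 and down with probability 1-p is transient, because the probability of ever returning to state 0 from state 1 is (1-p)/p < 1, so every state is visited only finitely often almost surely. The sketch below approximates this return probability by value iteration on a truncated state space; all function and parameter names are illustrative, not from the paper.

```python
# Sketch (illustrative only): estimate the probability that a biased random
# walk on {0, 1, 2, ...} ever returns to state 0, starting from state 1.
# From each state i >= 1 the walk moves to i+1 with probability p and to
# i-1 with probability 1-p.  For p > 1/2 the exact answer is (1-p)/p < 1,
# which is what makes the walk transient.

def return_probability(p: float, n_states: int = 60, iters: int = 5000) -> float:
    """Approximate P(reach state 0 | start in state 1) by value iteration
    on the truncated state space {0, ..., n_states}."""
    q = 1.0 - p
    r = [0.0] * (n_states + 1)
    r[0] = 1.0  # state 0 is the target (absorbing)
    for _ in range(iters):
        # Bellman update for reachability: r_i = q*r_{i-1} + p*r_{i+1}.
        # The truncation boundary r[n_states] stays 0 (walk escaped upward);
        # for p > 1/2 the truncation error is negligible for moderate n_states.
        new = r[:]
        for i in range(1, n_states):
            new[i] = q * r[i - 1] + p * r[i + 1]
        r = new
    return r[1]

# With p = 2/3, the exact return probability is (1/3)/(2/3) = 0.5,
# strictly below 1: returns to 0 eventually stop, i.e. the walk is transient.
print(round(return_probability(2/3), 4))  # → 0.5
```

For p = 1/2 (the unbiased walk) the return probability is 1 and the walk is recurrent, so the bias p > 1/2 is what makes this chain an example of a transient countable system in the sense of the abstract.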

Subject Classification

ACM Subject Classification
  • Theory of computation → Random walks and Markov chains
  • Mathematics of computing → Probability and statistics
Keywords
  • Markov decision processes
  • Parity
  • Transience

