Transience in Countable MDPs

Kiefer, Stefan; Mayr, Richard; Shirmohammadi, Mahsa; Totzke, Patrick

doi:10.4230/LIPIcs.CONCUR.2021.11

Abstract

The Transience objective is not to visit any state infinitely often. While this is not possible in any finite Markov Decision Process (MDP), it can be satisfied in countably infinite ones, e.g., if the transition graph is acyclic.
We prove the following fundamental properties of Transience in countably infinite MDPs.  
1) There exist uniformly ε-optimal MD strategies (memoryless deterministic) for Transience, even in infinitely branching MDPs. 
2) Optimal strategies for Transience need not exist, even if the MDP is finitely branching. However, if an optimal strategy exists then there is also an optimal MD strategy. 
3) If an MDP is universally transient (i.e., almost surely transient under all strategies) then many other objectives have a lower strategy complexity than in general MDPs. E.g., ε-optimal strategies for Safety and co-Büchi and optimal strategies for {0,1,2}-Parity (where they exist) can be chosen MD, even if the MDP is infinitely branching.

Pieter Abbeel and Andrew Y. Ng. Learning first-order Markov models for control. In Advances in Neural Information Processing Systems 17. MIT Press, 2004. URL: http://papers.nips.cc/paper/2569-learning-first-order-markov-models-for-control.
Galit Ashkenazi-Golan, János Flesch, Arkadi Predtetchinski, and Eilon Solan. Reachability and safety objectives in Markov decision processes on long but finite horizons. Journal of Optimization Theory and Applications, 2020.
Christel Baier and Joost-Pieter Katoen. Principles of Model Checking. MIT Press, 2008.
Patrick Billingsley. Probability and Measure. Wiley, 1995. Third Edition.
Vincent D. Blondel and John N. Tsitsiklis. A survey of computational complexity results in systems and control. Automatica, 2000.
Nicole Bäuerle and Ulrich Rieder. Markov Decision Processes with Applications to Finance. Springer-Verlag Berlin Heidelberg, 2011.
K. Chatterjee and T. Henzinger. A survey of stochastic ω-regular games. Journal of Computer and System Sciences, 2012.
Edmund M. Clarke, Thomas A. Henzinger, Helmut Veith, and Roderick Bloem, editors. Handbook of Model Checking. Springer, 2018. URL: http://dx.doi.org/10.1007/978-3-319-10575-8.
E.M. Clarke, O. Grumberg, and D. Peled. Model Checking. MIT Press, December 1999.
William Feller. An Introduction to Probability Theory and Its Applications. Wiley & Sons, second edition, 1966.
János Flesch, Arkadi Predtetchinski, and William Sudderth. Simplifying optimal strategies in limsup and liminf stochastic games. Discrete Applied Mathematics, 2018.
T.P. Hill and V.C. Pestien. The existence of good Markov strategies for decision processes with general payoffs. Stoch. Processes and Appl., 1987.
S. Kiefer, R. Mayr, M. Shirmohammadi, and P. Totzke. Transience in countable MDPs. In International Conference on Concurrency Theory, LIPIcs, 2021. Full version at URL: https://arxiv.org/abs/2012.13739.
Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, and Patrick Totzke. Büchi objectives in countable MDPs. In International Colloquium on Automata, Languages and Programming, LIPIcs, 2019. Full version at https://arxiv.org/abs/1904.11573. URL: http://dx.doi.org/10.4230/LIPIcs.ICALP.2019.119.
Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, and Patrick Totzke. Strategy Complexity of Parity Objectives in Countable MDPs. In International Conference on Concurrency Theory, 2020. URL: http://dx.doi.org/10.4230/LIPIcs.CONCUR.2020.7.
Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Patrick Totzke, and Dominik Wojtczak. How to play in infinite MDPs (invited talk). In International Colloquium on Automata, Languages and Programming, 2020. URL: http://dx.doi.org/10.4230/LIPIcs.ICALP.2020.3.
Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, and Dominik Wojtczak. Parity Objectives in Countable MDPs. In Annual IEEE Symposium on Logic in Computer Science, 2017. URL: http://dx.doi.org/10.1109/LICS.2017.8005100.
Richard Mayr and Eric Munday. Strategy Complexity of Mean Payoff, Total Payoff and Point Payoff Objectives in Countable MDPs. In International Conference on Concurrency Theory, LIPIcs, 2021. The full version is available at URL: https://arxiv.org/abs/2107.03287.
A. Mostowski. Regular expressions for infinite trees and a standard form of automata. In Computation Theory, LNCS, 1984.
Donald Ornstein. On the existence of stationary optimal strategies. Proceedings of the American Mathematical Society, 1969. URL: http://dx.doi.org/10.2307/2035700.
Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., 1st edition, 1994.
George Santayana. Reason in common sense, 1905. In Volume 1 of The Life of Reason. URL: https://en.wikipedia.org/wiki/George_Santayana.
Manfred Schäl. Markov decision processes in finance and dynamic options. In Handbook of Markov Decision Processes. Springer, 2002.
Olivier Sigaud and Olivier Buffet. Markov Decision Processes in Artificial Intelligence. John Wiley & Sons, 2013.
William D. Sudderth. Optimal Markov strategies. Decisions in Economics and Finance, 2020.
R.S. Sutton and A.G Barto. Reinforcement Learning: An Introduction. Adaptive Computation and Machine Learning. MIT Press, 2018.
Moshe Y. Vardi. Automatic verification of probabilistic concurrent finite-state programs. In Annual Symposium on Foundations of Computer Science. IEEE Computer Society, 1985. URL: http://dx.doi.org/10.1109/SFCS.1985.12.

Transience in Countable MDPs

Authors Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Patrick Totzke

File

Document Identifiers

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

References

Thanks for your feedback!

Could not send message

Transience in Countable MDPs

Authors Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Patrick Totzke

File

Document Identifiers

Related Versions

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

References

Thanks for your feedback!

Could not send message