Transience in Countable MDPs

Authors Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Patrick Totzke



File

LIPIcs.CONCUR.2021.11.pdf
  • Filesize: 0.72 MB
  • 15 pages

Author Details

Stefan Kiefer
  • Department of Computer Science, University of Oxford, UK
Richard Mayr
  • School of Informatics, University of Edinburgh, UK
Mahsa Shirmohammadi
  • Université de Paris, CNRS, IRIF, F-75013 Paris, France
Patrick Totzke
  • Department of Computer Science, University of Liverpool, UK

Cite As

Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, and Patrick Totzke. Transience in Countable MDPs. In 32nd International Conference on Concurrency Theory (CONCUR 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 203, pp. 11:1-11:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.CONCUR.2021.11

Abstract

The Transience objective is not to visit any state infinitely often. While this is not possible in any finite Markov Decision Process (MDP), it can be satisfied in countably infinite ones, e.g., if the transition graph is acyclic. We prove the following fundamental properties of Transience in countably infinite MDPs. 1) There exist uniformly ε-optimal MD strategies (memoryless deterministic) for Transience, even in infinitely branching MDPs. 2) Optimal strategies for Transience need not exist, even if the MDP is finitely branching. However, if an optimal strategy exists then there is also an optimal MD strategy. 3) If an MDP is universally transient (i.e., almost surely transient under all strategies) then many other objectives have a lower strategy complexity than in general MDPs. E.g., ε-optimal strategies for Safety and co-Büchi and optimal strategies for {0,1,2}-Parity (where they exist) can be chosen MD, even if the MDP is infinitely branching.
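As a rough illustration of the abstract's opening remark (not taken from the paper), the sketch below contrasts two hypothetical deterministic systems: an acyclic chain on the natural numbers, where no state is ever revisited, and a finite cycle, where some state must recur by the pigeonhole principle. The function names and both toy systems are assumptions made for this example. Note that a finite prefix can only suggest, never decide, whether a run is transient.

```python
from collections import Counter
from itertools import islice


def run_acyclic(start=0):
    """Hypothetical countable chain: state n always moves to n + 1 (acyclic)."""
    state = start
    while True:
        yield state
        state += 1


def run_finite(num_states, start=0):
    """Hypothetical finite chain: state n moves to (n + 1) mod num_states."""
    state = start
    while True:
        yield state
        state = (state + 1) % num_states


def max_visits(run, steps):
    """Largest number of visits to any single state in the first `steps` steps."""
    return max(Counter(islice(run, steps)).values())


if __name__ == "__main__":
    # Acyclic, countably infinite: every state is seen at most once.
    print(max_visits(run_acyclic(), 1000))
    # Finite: visit counts grow without bound as the prefix gets longer.
    print(max_visits(run_finite(5), 1000))
```

In the acyclic chain the visit count of every state stays bounded by one however long the prefix, which is the simplest way Transience can hold; in the finite cycle the count grows linearly with the prefix length, so no run can be transient.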

Subject Classification

ACM Subject Classification
  • Theory of computation → Random walks and Markov chains
  • Mathematics of computing → Probability and statistics
Keywords
  • Markov decision processes
  • Parity
  • Transience

