
Strategy Complexity of Mean Payoff, Total Payoff and Point Payoff Objectives in Countable MDPs

Authors: Richard Mayr, Eric Munday




File

LIPIcs.CONCUR.2021.12.pdf
  • Filesize: 0.71 MB
  • 15 pages


Author Details

Richard Mayr
  • University of Edinburgh, UK
Eric Munday
  • University of Edinburgh, UK

Acknowledgements

We thank an anonymous reviewer for very detailed and helpful comments.

Cite As

Richard Mayr and Eric Munday. Strategy Complexity of Mean Payoff, Total Payoff and Point Payoff Objectives in Countable MDPs. In 32nd International Conference on Concurrency Theory (CONCUR 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 203, pp. 12:1-12:15, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.CONCUR.2021.12

Abstract

We study countably infinite Markov decision processes (MDPs) with real-valued transition rewards. Every infinite run induces the following sequences of payoffs: 1. Point payoff (the sequence of directly seen transition rewards), 2. Total payoff (the sequence of the sums of all rewards so far), and 3. Mean payoff. For each payoff type, the objective is to maximize the probability that the liminf is non-negative. We establish the complete picture of the strategy complexity of these objectives, i.e., how much memory is necessary and sufficient for ε-optimal (resp. optimal) strategies. Some cases can be won with memoryless deterministic strategies, while others require a step counter, a reward counter, or both.
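As an illustration of the three payoff sequences named in the abstract, the following minimal Python sketch computes them for a finite prefix of a run. The function name `payoff_sequences` and the sample rewards are hypothetical, and dividing the total after n steps by n is the standard reading of "mean payoff", assumed here rather than quoted from the paper. A finite prefix cannot decide the liminf of an infinite run, so this only shows how the sequences relate to one another.

```python
from itertools import accumulate
from typing import List, Tuple


def payoff_sequences(
    rewards: List[float],
) -> Tuple[List[float], List[float], List[float]]:
    """Return (point, total, mean) payoff sequences for a finite reward prefix.

    Assumption: the mean payoff after n steps is the total payoff after
    n steps divided by n (n counted from 1).
    """
    # Point payoff: the transition rewards exactly as they are seen.
    point = list(rewards)
    # Total payoff: running sum of all rewards seen so far.
    total = list(accumulate(rewards))
    # Mean payoff: running sum divided by the number of steps taken.
    mean = [s / n for n, s in enumerate(total, start=1)]
    return point, total, mean


# Hypothetical reward prefix of a single run.
point, total, mean = payoff_sequences([1.0, -2.0, 3.0, -2.0])
```

On this sample prefix the point payoff dips below zero at steps 2 and 4 while the total payoff ends at zero, which hints at why the three objectives can differ in how much memory a strategy needs to satisfy them.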

Subject Classification

ACM Subject Classification
  • Theory of computation → Random walks and Markov chains
  • Mathematics of computing → Probability and statistics
Keywords
  • Markov decision processes
  • Strategy complexity
  • Mean payoff

