Finite-Memory Strategies for Almost-Sure Energy-MeanPayoff Objectives in MDPs

Authors: Mohan Dantam, Richard Mayr




File

LIPIcs.ICALP.2024.133.pdf
  • Filesize: 0.8 MB
  • 17 pages

Document Identifiers
  • DOI: 10.4230/LIPIcs.ICALP.2024.133

Author Details

Mohan Dantam
  • School of Informatics, University of Edinburgh, UK
Richard Mayr
  • School of Informatics, University of Edinburgh, UK

Cite As

Mohan Dantam and Richard Mayr. Finite-Memory Strategies for Almost-Sure Energy-MeanPayoff Objectives in MDPs. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 133:1-133:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ICALP.2024.133

Abstract

We consider finite-state Markov decision processes with the combined Energy-MeanPayoff objective. The controller tries to avoid running out of energy while simultaneously attaining a strictly positive mean payoff in a second dimension. We show that finite memory suffices for almost surely winning strategies for the Energy-MeanPayoff objective. This is in contrast to the closely related Energy-Parity objective, where almost surely winning strategies require infinite memory in general. We show that exponential memory is sufficient (even for deterministic strategies) and necessary (even for randomized strategies) for almost surely winning Energy-MeanPayoff. The upper bound holds even if the strictly positive mean payoff part of the objective is generalized to multidimensional strictly positive mean payoff. Finally, it is decidable in pseudo-polynomial time whether an almost surely winning strategy exists.
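
For orientation, the combined objective can be formalized as follows. This is a minimal sketch in notation common to the energy and mean-payoff literature; the symbols $c_0$, $r_1$, $r_2$ and the run $\rho$ are our choices of notation, not taken verbatim from the paper. Transitions carry integer reward pairs $(r_1(e), r_2(e))$, and a run $\rho = e_0 e_1 e_2 \dots$ with initial energy level $c_0 \in \mathbb{N}$ satisfies Energy-MeanPayoff iff

  \forall n \ge 0:\quad c_0 + \sum_{i=0}^{n-1} r_1(e_i) \;\ge\; 0
  \qquad \text{and} \qquad
  \liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} r_2(e_i) \;>\; 0.

A strategy is almost surely winning from a given state if, under that strategy, the set of runs satisfying both conditions has probability 1. In the multidimensional generalization mentioned above, $r_2$ maps into $\mathbb{Z}^d$ and the liminf condition is required in every coordinate.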

Subject Classification

ACM Subject Classification
  • Theory of computation → Random walks and Markov chains
  • Mathematics of computing → Probability and statistics
Keywords
  • Markov decision processes
  • energy
  • mean payoff
  • parity
  • strategy complexity

