On Skolem-Hardness and Saturation Points in Markov Decision Processes

Piribauer, Jakob; Baier, Christel

doi:10.4230/LIPIcs.ICALP.2020.138

Abstract

The Skolem problem and the related Positivity problem for linear recurrence sequences are outstanding number-theoretic problems whose decidability has been open for many decades. In this paper, the inherent mathematical difficulty of a series of optimization problems on Markov decision processes (MDPs) is shown by a reduction from the Positivity problem to the associated decision problems which establishes that the problems are also at least as hard as the Skolem problem as an immediate consequence. The optimization problems under consideration are two non-classical variants of the stochastic shortest path problem (SSPP) in terms of expected partial or conditional accumulated weights, the optimization of the conditional value-at-risk for accumulated weights, and two problems addressing the long-run satisfaction of path properties, namely the optimization of long-run probabilities of regular co-safety properties and the model-checking problem of the logic frequency-LTL. To prove the Positivity- and hence Skolem-hardness for the latter two problems, a new auxiliary path measure, called weighted long-run frequency, is introduced and the Positivity-hardness of the corresponding decision problem is shown as an intermediate step. For the partial and conditional SSPP on MDPs with non-negative weights and for the optimization of long-run probabilities of constrained reachability properties (aU b), solutions are known that rely on the identification of a bound on the accumulated weight or the number of consecutive visits to certain sates, called a saturation point, from which on optimal schedulers behave memorylessly. In this paper, it is shown that also the optimization of the conditional value-at-risk for the classical SSPP and of weighted long-run frequencies on MDPs with non-negative weights can be solved in pseudo-polynomial time exploiting the existence of a saturation point. As a consequence, one obtains the decidability of the qualitative model-checking problem of a frequency-LTL formula that is not included in the fragments with known solutions.

Carlo Acerbi and Dirk Tasche. Expected shortfall: A natural coherent alternative to value at risk. Economic Notes, 31(2):379-388, 2002. URL: https://doi.org/10.1111/1468-0300.00091.
S Akshay, Timos Antonopoulos, Joël Ouaknine, and James Worrell. Reachability problems for Markov chains. Information Processing Letters, 115(2):155-158, 2015.
Christel Baier, Nathalie Bertrand, Clemens Dubslaff, Daniel Gburek, and Ocan Sankur. Stochastic shortest paths and weight-bounded properties in Markov decision processes. In Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS'18), pages 86-94. ACM, 2018. URL: https://doi.org/10.1145/3209108.3209184.
Christel Baier, Nathalie Bertrand, Jakob Piribauer, and Ocan Sankur. Long-run satisfaction of path properties. In Proceedings of the Thirty-Fourth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS'19), pages 1-14. IEEE, 2019.
Christel Baier, Clemens Dubslaff, Joachim Klein, Sascha Klüppelholz, and Sascha Wunderlich. Probabilistic model checking for energy-utility analysis. In Franck van Breugel, Elham Kashefi, Catuscia Palamidessi, and Jan Rutten, editors, Horizons of the Mind. A Tribute to Prakash Panangaden, volume 8464 of LNCS, pages 96-123. Springer, 2014.
Christel Baier, Marcus Größer, and Nathalie Bertrand. Probabilistic ω-automata. Journal of the ACM, 59(1):1:1-1:52, 2012. URL: https://doi.org/10.1145/2108242.2108243.
Christel Baier, Joachim Klein, Sascha Klüppelholz, and Sascha Wunderlich. Weight monitoring with linear temporal logic: Complexity and decidability. In Proceedings of the 23rd Conference on Computer Science Logic and the 29th Symposium on Logic In Computer Science (CSL-LICS'14), pages 11:1-11:10. ACM, 2014. URL: https://doi.org/10.1145/2603088.2603162.
Christel Baier, Joachim Klein, Sascha Klüppelholz, and Sascha Wunderlich. Maximizing the conditional expected reward for reaching the goal. In Axel Legay and Tiziana Margaria, editors, Tools and Algorithms for the Construction and Analysis of Systems (TACAS'17), volume 10206 of Lecture Notes in Computer Science, pages 269-285. Springer, 2017.
Gilles Barthe, Thomas Espitau, Luis María Ferrer Fioriti, and Justin Hsu. Synthesizing probabilistic invariants via Doob’s decomposition. In Swarat Chaudhuri and Azadeh Farzan, editors, Proceedings of the 28th International Conference on Computer Aided Verification (CAV'16), Part I, volume 9779 of LNCS, pages 43-61. Springer, 2016.
Dimitri P Bertsekas and John N Tsitsiklis. An analysis of stochastic shortest path problems. Mathematics of Operations Research, 16(3):580-595, 1991.
Tomás Brázdil, Václav Brožek, Kousha Etessami, Antonín Kučera, and Dominik Wojtczak. One-counter Markov decision processes. In Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms (SODA'10), pages 863-874. SIAM, 2010.
Tomás Brázdil, Antonín Kucera, and Petr Novotný. Optimizing the expected mean payoff in energy Markov decision processes. In 14th International Symposium on Automated Technology for Verification and Analysis (ATVA'16), volume 9938 of Lecture Notes in Computer Science, pages 32-49, 2016.
Pavol Cerný, Thomas A. Henzinger, and Arjun Radhakrishna. Simulation distances. Theoretical Computer Science, 413(1):21-35, 2012.
Krishnendu Chatterjee and Laurent Doyen. Energy and mean-payoff parity Markov decision processes. In Proceedings of the 36th International Symposium on Mathematical Foundations of Computer Science (MFCS'11), volume 6907 of Lecture Notes in Computer Science, pages 206-218. Springer, 2011.
Krishnendu Chatterjee, Hongfei Fu, and Amir Kafshdar Goharshady. Termination analysis of probabilistic programs through Positivstellensatz’s. In Swarat Chaudhuri and Azadeh Farzan, editors, Proceedings of the 28th International Conference on Computer Aided Verification (CAV'16), Part I, volume 9779 of LNCS, pages 3-22. Springer, 2016.
Taolue Chen, Vojtěch Forejt, Marta Kwiatkowska, David Parker, and Aistis Simaitis. Automatic verification of competitive stochastic systems. Formal Methods in System Design, 43(1):61-92, 2013.
Hana Chockler, Orna Kupferman, and Moshe Y. Vardi. Coverage metrics for temporal logic model checking. Formal Methods in System Design, 28(3):189-212, 2006.
Ventsislav Chonev, Joël Ouaknine, and James Worrell. On the Skolem problem for continuous linear dynamical systems. In 43rd International Colloquium on Automata, Languages, and Programming, (ICALP'16), volume 55 of LIPIcs, pages 100:1-100:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016.
Costas Courcoubetis and Mihalis Yannakakis. The complexity of probabilistic verification. Journal of the ACM, 42(4):857-907, 1995.
Luca de Alfaro. Computing minimum and maximum reachability times in probabilistic systems. In Jos C. M. Baeten and Sjouke Mauw, editors, Proceedings of the 10th International Conference on Concurrency Theory (CONCUR'99), volume 1664 of Lecture Notes in Computer Science, pages 66-81. Springer, 1999.
Kousha Etessami and Mihalis Yannakakis. Recursive Markov decision processes and recursive stochastic games. In Luís Caires, Giuseppe F. Italiano, Luís Monteiro, Catuscia Palamidessi, and Moti Yung, editors, Proceedings of the 32nd International Colloquium on Automata, Languages and Programming (ICALP'05), volume 3580 of Lecture Notes in Computer Science, pages 891-903. Springer, 2005.
Graham Everest, Alfred Jacobus Van Der Poorten, Igor Shparlinski, Thomas Ward, et al. Recurrence sequences, volume 104. American Mathematical Society, 2003.
Vojtech Forejt and Jan Krcál. On frequency LTL in probabilistic systems. In Luca Aceto and David de Frutos-Escrig, editors, 26th International Conference on Concurrency Theory (CONCUR'15), volume 42 of LIPIcs, pages 184-197. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2015. URL: https://doi.org/10.4230/LIPIcs.CONCUR.2015.184.
Vojtech Forejt, Jan Krcál, and Jan Kretínský. Controller synthesis for MDPs and frequency LTL_⧵ GU. In Martin Davis, Ansgar Fehnker, Annabelle McIver, and Andrei Voronkov, editors, 20th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning (LPAR'15), volume 9450 of Lecture Notes in Computer Science, pages 162-177. Springer, 2015. URL: https://doi.org/10.1007/978-3-662-48899-7_12.
Friedrich Gretz, Joost-Pieter Katoen, and Annabelle McIver. Operational versus weakest pre-expectation semantics for the probabilistic guarded command language. Performance Evaluation, 73:110-132, 2014.
Christoph Haase and Stefan Kiefer. The odds of staying on budget. In Magnús M. Halldórsson, Kazuo Iwama, Naoki Kobayashi, and Bettina Speckmann, editors, Proceedings of the 42nd International Colloquium on Automata, Languages, and Programming (ICALP'15), volume 9134, pages 234-246. Springer, 2015.
Vesa Halava, Tero Harju, Mika Hirvensalo, and Juhani Karhumäki. Skolem’s problem-on the border between decidability and undecidability. Technical report, Technical Report 683, Turku Centre for Computer Science, 2005.
Thomas A. Henzinger. Quantitative reactive modeling and verification. Computer Science - Research and Development, 28(4):331-344, November 2013. URL: https://doi.org/10.1007/s00450-013-0251-7.
Thomas A. Henzinger and Jan Otop. From model checking to model measuring. In 24th International Conference on Concurrency Theory (CONCUR), volume 8052 of Lecture Notes in Computer Science, pages 273-287. Springer, 2013.
Joost-Pieter Katoen, Friedrich Gretz, Nils Jansen, Benjamin Lucien Kaminski, and Federico Olmedo. Understanding probabilistic programs. In Roland Meyer, André Platzer, and Heike Wehrheim, editors, Correct System Design - Proceedings of the Symposium in Honor of Ernst-Rüdiger Olderog on the Occasion of His 60th Birthday, volume 9360 of LNCS, pages 15-32. Springer, 2015.
Jan Kretínský and Tobias Meggendorfer. Conditional value-at-risk for reachability and mean payoff in Markov decision processes. In Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS'18), pages 609-618. ACM, 2018. URL: https://doi.org/10.1145/3209108.3209176.
Orna Kupferman and Moshe Y. Vardi. Robust satisfaction. In 10th International Conference on Concurrency Theory (CONCUR), volume 1664 of Lecture Notes in Computer Science, pages 383-398. Springer, 1999.
Omid Madani, Steve Hanks, and Anne Condon. On the undecidability of probabilistic planning and infinite-horizon partially observable Markov decision problems. In Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI'99), pages 541-548. MIT Press, 1999.
Rupak Majumdar, Mahmoud Salamati, and Sadegh Soudjani. On decidability of time-bounded reachability in CTMDPs. In 47th International Colloquium on Automata, Languages, and Programming, (ICALP'20), to appear.
Richard Mayr, Sven Schewe, Patrick Totzke, and Dominik Wojtczak. MDPs with energy-parity objectives. In 32nd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS'17), IEEE Computer Society, pages 1-12, 2017.
Federico Olmedo, Friedrich Gretz, Nils Jansen, Benjamin Lucien Kaminski, Joost-Pieter Katoen, and Annabelle Mciver. Conditioning in probabilistic programming. ACM Transactions on Programming Languages and Systems, 40(1):4:1-4:50, 2018.
Joël Ouaknine and James Worrell. Positivity problems for low-order linear recurrence sequences. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms (SODA'14), pages 366-379. SIAM, 2014.
Jakob Piribauer and Christel Baier. Partial and conditional expectations in Markov decision processes with integer weights. In Mikolaj Bojanczyk and Alex Simpson, editors, Proceedings of the 22nd International Conference on Foundations of Software Science and Computation Structures (FoSSaCS'19), volume 11425 of Lecture Notes in Computer Science, pages 436-452. Springer, 2019.
Jakob Piribauer and Christel Baier. On Skolem-hardness and saturation points in Markov decision processes, 2020. available at URL: https://arxiv.org/abs/2004.11441.
Martin L Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., 1994.
Mickael Randour, Jean-François Raskin, and Ocan Sankur. Variations on the stochastic shortest path problem. In International Workshop on Verification, Model Checking, and Abstract Interpretation, pages 1-18. Springer, 2015.
Mickael Randour, Jean-François Raskin, and Ocan Sankur. Percentile queries in multi-dimensional Markov decision processes. Formal methods in system design, 50(2-3):207-248, 2017.
Paulo Tabuada and Daniel Neider. Robust linear temporal logic. In 25th EACSL Annual Conference on Computer Science Logic (CSL), volume 62 of LIPIcs, pages 10:1-10:21. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2016.
Michael Ummels and Christel Baier. Computing quantiles in Markov reward models. In Frank Pfenning, editor, Proc. of the 16th International Conference on Foundations of Software Science and Computation Structures (FOSSACS'13), volume 7794 of Lecture Notes in Computer Science, pages 353-368. Springer, 2013. URL: https://doi.org/10.1007/978-3-642-37075-5_23.
Stanislav Uryasev. Conditional value-at-risk: optimization algorithms and applications. In Proc. Computational Intelligence and Financial Engineering (CIFEr), pages 49-57. IEEE, 2000.

On Skolem-Hardness and Saturation Points in Markov Decision Processes

Authors Jakob Piribauer , Christel Baier

File

Document Identifiers

Author Details

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message

On Skolem-Hardness and Saturation Points in Markov Decision Processes

Authors Jakob Piribauer , Christel Baier

File

Document Identifiers

Author Details

Funding

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

Related Versions

References

Thanks for your feedback!

Could not send message