On Skolem-Hardness and Saturation Points in Markov Decision Processes

Authors Jakob Piribauer , Christel Baier



PDF
Thumbnail PDF

File

LIPIcs.ICALP.2020.138.pdf
  • Filesize: 0.59 MB
  • 17 pages

Document Identifiers

Author Details

Jakob Piribauer
  • Technische Universität Dresden, Germany
Christel Baier
  • Technische Universität Dresden, Germany

Cite As Get BibTex

Jakob Piribauer and Christel Baier. On Skolem-Hardness and Saturation Points in Markov Decision Processes. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 138:1-138:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/LIPIcs.ICALP.2020.138

Abstract

The Skolem problem and the related Positivity problem for linear recurrence sequences are outstanding number-theoretic problems whose decidability has been open for many decades. In this paper, the inherent mathematical difficulty of a series of optimization problems on Markov decision processes (MDPs) is shown by a reduction from the Positivity problem to the associated decision problems which establishes that the problems are also at least as hard as the Skolem problem as an immediate consequence. The optimization problems under consideration are two non-classical variants of the stochastic shortest path problem (SSPP) in terms of expected partial or conditional accumulated weights, the optimization of the conditional value-at-risk for accumulated weights, and two problems addressing the long-run satisfaction of path properties, namely the optimization of long-run probabilities of regular co-safety properties and the model-checking problem of the logic frequency-LTL. To prove the Positivity- and hence Skolem-hardness for the latter two problems, a new auxiliary path measure, called weighted long-run frequency, is introduced and the Positivity-hardness of the corresponding decision problem is shown as an intermediate step. For the partial and conditional SSPP on MDPs with non-negative weights and for the optimization of long-run probabilities of constrained reachability properties (aU b), solutions are known that rely on the identification of a bound on the accumulated weight or the number of consecutive visits to certain sates, called a saturation point, from which on optimal schedulers behave memorylessly. In this paper, it is shown that also the optimization of the conditional value-at-risk for the classical SSPP and of weighted long-run frequencies on MDPs with non-negative weights can be solved in pseudo-polynomial time exploiting the existence of a saturation point. As a consequence, one obtains the decidability of the qualitative model-checking problem of a frequency-LTL formula that is not included in the fragments with known solutions.

Subject Classification

ACM Subject Classification
  • Theory of computation → Probabilistic computation
  • Theory of computation → Logic and verification
Keywords
  • Markov decision process
  • Skolem problem
  • stochastic shortest path
  • conditional expectation
  • conditional value-at-risk
  • model checking
  • frequency-LTL

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Carlo Acerbi and Dirk Tasche. Expected shortfall: A natural coherent alternative to value at risk. Economic Notes, 31(2):379-388, 2002. URL: https://doi.org/10.1111/1468-0300.00091.
  2. S Akshay, Timos Antonopoulos, Joël Ouaknine, and James Worrell. Reachability problems for Markov chains. Information Processing Letters, 115(2):155-158, 2015. Google Scholar
  3. Christel Baier, Nathalie Bertrand, Clemens Dubslaff, Daniel Gburek, and Ocan Sankur. Stochastic shortest paths and weight-bounded properties in Markov decision processes. In Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS'18), pages 86-94. ACM, 2018. URL: https://doi.org/10.1145/3209108.3209184.
  4. Christel Baier, Nathalie Bertrand, Jakob Piribauer, and Ocan Sankur. Long-run satisfaction of path properties. In Proceedings of the Thirty-Fourth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS'19), pages 1-14. IEEE, 2019. Google Scholar
  5. Christel Baier, Clemens Dubslaff, Joachim Klein, Sascha Klüppelholz, and Sascha Wunderlich. Probabilistic model checking for energy-utility analysis. In Franck van Breugel, Elham Kashefi, Catuscia Palamidessi, and Jan Rutten, editors, Horizons of the Mind. A Tribute to Prakash Panangaden, volume 8464 of LNCS, pages 96-123. Springer, 2014. Google Scholar
  6. Christel Baier, Marcus Größer, and Nathalie Bertrand. Probabilistic ω-automata. Journal of the ACM, 59(1):1:1-1:52, 2012. URL: https://doi.org/10.1145/2108242.2108243.
  7. Christel Baier, Joachim Klein, Sascha Klüppelholz, and Sascha Wunderlich. Weight monitoring with linear temporal logic: Complexity and decidability. In Proceedings of the 23rd Conference on Computer Science Logic and the 29th Symposium on Logic In Computer Science (CSL-LICS'14), pages 11:1-11:10. ACM, 2014. URL: https://doi.org/10.1145/2603088.2603162.
  8. Christel Baier, Joachim Klein, Sascha Klüppelholz, and Sascha Wunderlich. Maximizing the conditional expected reward for reaching the goal. In Axel Legay and Tiziana Margaria, editors, Tools and Algorithms for the Construction and Analysis of Systems (TACAS'17), volume 10206 of Lecture Notes in Computer Science, pages 269-285. Springer, 2017. Google Scholar
  9. Gilles Barthe, Thomas Espitau, Luis María Ferrer Fioriti, and Justin Hsu. Synthesizing probabilistic invariants via Doob’s decomposition. In Swarat Chaudhuri and Azadeh Farzan, editors, Proceedings of the 28th International Conference on Computer Aided Verification (CAV'16), Part I, volume 9779 of LNCS, pages 43-61. Springer, 2016. Google Scholar
  10. Dimitri P Bertsekas and John N Tsitsiklis. An analysis of stochastic shortest path problems. Mathematics of Operations Research, 16(3):580-595, 1991. Google Scholar
  11. Tomás Brázdil, Václav Brožek, Kousha Etessami, Antonín Kučera, and Dominik Wojtczak. One-counter Markov decision processes. In Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms (SODA'10), pages 863-874. SIAM, 2010. Google Scholar
  12. Tomás Brázdil, Antonín Kucera, and Petr Novotný. Optimizing the expected mean payoff in energy Markov decision processes. In 14th International Symposium on Automated Technology for Verification and Analysis (ATVA'16), volume 9938 of Lecture Notes in Computer Science, pages 32-49, 2016. Google Scholar
  13. Pavol Cerný, Thomas A. Henzinger, and Arjun Radhakrishna. Simulation distances. Theoretical Computer Science, 413(1):21-35, 2012. Google Scholar
  14. Krishnendu Chatterjee and Laurent Doyen. Energy and mean-payoff parity Markov decision processes. In Proceedings of the 36th International Symposium on Mathematical Foundations of Computer Science (MFCS'11), volume 6907 of Lecture Notes in Computer Science, pages 206-218. Springer, 2011. Google Scholar
  15. Krishnendu Chatterjee, Hongfei Fu, and Amir Kafshdar Goharshady. Termination analysis of probabilistic programs through Positivstellensatz’s. In Swarat Chaudhuri and Azadeh Farzan, editors, Proceedings of the 28th International Conference on Computer Aided Verification (CAV'16), Part I, volume 9779 of LNCS, pages 3-22. Springer, 2016. Google Scholar
  16. Taolue Chen, Vojtěch Forejt, Marta Kwiatkowska, David Parker, and Aistis Simaitis. Automatic verification of competitive stochastic systems. Formal Methods in System Design, 43(1):61-92, 2013. Google Scholar
  17. Hana Chockler, Orna Kupferman, and Moshe Y. Vardi. Coverage metrics for temporal logic model checking. Formal Methods in System Design, 28(3):189-212, 2006. Google Scholar
  18. Ventsislav Chonev, Joël Ouaknine, and James Worrell. On the Skolem problem for continuous linear dynamical systems. In 43rd International Colloquium on Automata, Languages, and Programming, (ICALP'16), volume 55 of LIPIcs, pages 100:1-100:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016. Google Scholar
  19. Costas Courcoubetis and Mihalis Yannakakis. The complexity of probabilistic verification. Journal of the ACM, 42(4):857-907, 1995. Google Scholar
  20. Luca de Alfaro. Computing minimum and maximum reachability times in probabilistic systems. In Jos C. M. Baeten and Sjouke Mauw, editors, Proceedings of the 10th International Conference on Concurrency Theory (CONCUR'99), volume 1664 of Lecture Notes in Computer Science, pages 66-81. Springer, 1999. Google Scholar
  21. Kousha Etessami and Mihalis Yannakakis. Recursive Markov decision processes and recursive stochastic games. In Luís Caires, Giuseppe F. Italiano, Luís Monteiro, Catuscia Palamidessi, and Moti Yung, editors, Proceedings of the 32nd International Colloquium on Automata, Languages and Programming (ICALP'05), volume 3580 of Lecture Notes in Computer Science, pages 891-903. Springer, 2005. Google Scholar
  22. Graham Everest, Alfred Jacobus Van Der Poorten, Igor Shparlinski, Thomas Ward, et al. Recurrence sequences, volume 104. American Mathematical Society, 2003. Google Scholar
  23. Vojtech Forejt and Jan Krcál. On frequency LTL in probabilistic systems. In Luca Aceto and David de Frutos-Escrig, editors, 26th International Conference on Concurrency Theory (CONCUR'15), volume 42 of LIPIcs, pages 184-197. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2015. URL: https://doi.org/10.4230/LIPIcs.CONCUR.2015.184.
  24. Vojtech Forejt, Jan Krcál, and Jan Kretínský. Controller synthesis for MDPs and frequency LTL_⧵ GU. In Martin Davis, Ansgar Fehnker, Annabelle McIver, and Andrei Voronkov, editors, 20th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning (LPAR'15), volume 9450 of Lecture Notes in Computer Science, pages 162-177. Springer, 2015. URL: https://doi.org/10.1007/978-3-662-48899-7_12.
  25. Friedrich Gretz, Joost-Pieter Katoen, and Annabelle McIver. Operational versus weakest pre-expectation semantics for the probabilistic guarded command language. Performance Evaluation, 73:110-132, 2014. Google Scholar
  26. Christoph Haase and Stefan Kiefer. The odds of staying on budget. In Magnús M. Halldórsson, Kazuo Iwama, Naoki Kobayashi, and Bettina Speckmann, editors, Proceedings of the 42nd International Colloquium on Automata, Languages, and Programming (ICALP'15), volume 9134, pages 234-246. Springer, 2015. Google Scholar
  27. Vesa Halava, Tero Harju, Mika Hirvensalo, and Juhani Karhumäki. Skolem’s problem-on the border between decidability and undecidability. Technical report, Technical Report 683, Turku Centre for Computer Science, 2005. Google Scholar
  28. Thomas A. Henzinger. Quantitative reactive modeling and verification. Computer Science - Research and Development, 28(4):331-344, November 2013. URL: https://doi.org/10.1007/s00450-013-0251-7.
  29. Thomas A. Henzinger and Jan Otop. From model checking to model measuring. In 24th International Conference on Concurrency Theory (CONCUR), volume 8052 of Lecture Notes in Computer Science, pages 273-287. Springer, 2013. Google Scholar
  30. Joost-Pieter Katoen, Friedrich Gretz, Nils Jansen, Benjamin Lucien Kaminski, and Federico Olmedo. Understanding probabilistic programs. In Roland Meyer, André Platzer, and Heike Wehrheim, editors, Correct System Design - Proceedings of the Symposium in Honor of Ernst-Rüdiger Olderog on the Occasion of His 60th Birthday, volume 9360 of LNCS, pages 15-32. Springer, 2015. Google Scholar
  31. Jan Kretínský and Tobias Meggendorfer. Conditional value-at-risk for reachability and mean payoff in Markov decision processes. In Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS'18), pages 609-618. ACM, 2018. URL: https://doi.org/10.1145/3209108.3209176.
  32. Orna Kupferman and Moshe Y. Vardi. Robust satisfaction. In 10th International Conference on Concurrency Theory (CONCUR), volume 1664 of Lecture Notes in Computer Science, pages 383-398. Springer, 1999. Google Scholar
  33. Omid Madani, Steve Hanks, and Anne Condon. On the undecidability of probabilistic planning and infinite-horizon partially observable Markov decision problems. In Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI'99), pages 541-548. MIT Press, 1999. Google Scholar
  34. Rupak Majumdar, Mahmoud Salamati, and Sadegh Soudjani. On decidability of time-bounded reachability in CTMDPs. In 47th International Colloquium on Automata, Languages, and Programming, (ICALP'20), to appear. Google Scholar
  35. Richard Mayr, Sven Schewe, Patrick Totzke, and Dominik Wojtczak. MDPs with energy-parity objectives. In 32nd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS'17), IEEE Computer Society, pages 1-12, 2017. Google Scholar
  36. Federico Olmedo, Friedrich Gretz, Nils Jansen, Benjamin Lucien Kaminski, Joost-Pieter Katoen, and Annabelle Mciver. Conditioning in probabilistic programming. ACM Transactions on Programming Languages and Systems, 40(1):4:1-4:50, 2018. Google Scholar
  37. Joël Ouaknine and James Worrell. Positivity problems for low-order linear recurrence sequences. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms (SODA'14), pages 366-379. SIAM, 2014. Google Scholar
  38. Jakob Piribauer and Christel Baier. Partial and conditional expectations in Markov decision processes with integer weights. In Mikolaj Bojanczyk and Alex Simpson, editors, Proceedings of the 22nd International Conference on Foundations of Software Science and Computation Structures (FoSSaCS'19), volume 11425 of Lecture Notes in Computer Science, pages 436-452. Springer, 2019. Google Scholar
  39. Jakob Piribauer and Christel Baier. On Skolem-hardness and saturation points in Markov decision processes, 2020. available at URL: https://arxiv.org/abs/2004.11441.
  40. Martin L Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., 1994. Google Scholar
  41. Mickael Randour, Jean-François Raskin, and Ocan Sankur. Variations on the stochastic shortest path problem. In International Workshop on Verification, Model Checking, and Abstract Interpretation, pages 1-18. Springer, 2015. Google Scholar
  42. Mickael Randour, Jean-François Raskin, and Ocan Sankur. Percentile queries in multi-dimensional Markov decision processes. Formal methods in system design, 50(2-3):207-248, 2017. Google Scholar
  43. Paulo Tabuada and Daniel Neider. Robust linear temporal logic. In 25th EACSL Annual Conference on Computer Science Logic (CSL), volume 62 of LIPIcs, pages 10:1-10:21. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2016. Google Scholar
  44. Michael Ummels and Christel Baier. Computing quantiles in Markov reward models. In Frank Pfenning, editor, Proc. of the 16th International Conference on Foundations of Software Science and Computation Structures (FOSSACS'13), volume 7794 of Lecture Notes in Computer Science, pages 353-368. Springer, 2013. URL: https://doi.org/10.1007/978-3-642-37075-5_23.
  45. Stanislav Uryasev. Conditional value-at-risk: optimization algorithms and applications. In Proc. Computational Intelligence and Financial Engineering (CIFEr), pages 49-57. IEEE, 2000. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail