Life Is Random, Time Is Not: Markov Decision Processes with Window Objectives

Brihaye, Thomas; Delgrange, Florent; Oualhadj, Youssouf; Randour, Mickael

doi:10.4230/LIPIcs.CONCUR.2019.8

Abstract

The window mechanism was introduced by Chatterjee et al. [Krishnendu Chatterjee et al., 2015] to strengthen classical game objectives with time bounds. It permits to synthesize system controllers that exhibit acceptable behaviors within a configurable time frame, all along their infinite execution, in contrast to the traditional objectives that only require correctness of behaviors in the limit. The window concept has proved its interest in a variety of two-player zero-sum games, thanks to the ability to reason about such time bounds in system specifications, but also the increased tractability that it usually yields. In this work, we extend the window framework to stochastic environments by considering the fundamental threshold probability problem in Markov decision processes for window objectives. That is, given such an objective, we want to synthesize strategies that guarantee satisfying runs with a given probability. We solve this problem for the usual variants of window objectives, where either the time frame is set as a parameter, or we ask if such a time frame exists. We develop a generic approach for window-based objectives and instantiate it for the classical mean-payoff and parity objectives, already considered in games. Our work paves the way to a wide use of the window mechanism in stochastic models.

Parosh Aziz Abdulla, Noomene Ben Henda, and Richard Mayr. Decisive Markov Chains. Logical Methods in Computer Science, 3(4), 2007. URL: https://doi.org/10.2168/LMCS-3(4:7)2007.
Pranav Ashok, Krishnendu Chatterjee, Przemyslaw Daca, Jan Kretínský, and Tobias Meggendorfer. Value Iteration for Long-Run Average Reward in Markov Decision Processes. In Rupak Majumdar and Viktor Kuncak, editors, Computer Aided Verification - 29th International Conference, CAV 2017, Heidelberg, Germany, July 24-28, 2017, Proceedings, Part I, volume 10426 of Lecture Notes in Computer Science, pages 201-221. Springer, 2017. URL: https://doi.org/10.1007/978-3-319-63387-9_10.
Christel Baier. Reasoning About Cost-Utility Constraints in Probabilistic Models. In Mikolaj Bojanczyk, Slawomir Lasota, and Igor Potapov, editors, Reachability Problems - 9th International Workshop, RP 2015, Warsaw, Poland, September 21-23, 2015, Proceedings, volume 9328 of Lecture Notes in Computer Science, pages 1-6. Springer, 2015. URL: https://doi.org/10.1007/978-3-319-24537-9_1.
Christel Baier and Joost-Pieter Katoen. Principles of model checking. MIT press, 2008.
Christel Baier, Joachim Klein, Sascha Klüppelholz, and Sascha Wunderlich. Weight monitoring with linear temporal logic: complexity and decidability. In Thomas A. Henzinger and Dale Miller, editors, Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), CSL-LICS '14, Vienna, Austria, July 14 - 18, 2014, pages 11:1-11:10. ACM, 2014. URL: https://doi.org/10.1145/2603088.2603162.
Catriel Beeri. On the Membership Problem for Functional and Multivalued Dependencies in Relational Databases. ACM Trans. Database Syst., 5(3):241-259, 1980. URL: https://doi.org/10.1145/320613.320614.
Raphaël Berthon, Mickael Randour, and Jean-François Raskin. Threshold Constraints with Guarantees for Parity Objectives in Markov Decision Processes. In Ioannis Chatzigiannakis, Piotr Indyk, Fabian Kuhn, and Anca Muscholl, editors, 44th International Colloquium on Automata, Languages, and Programming, ICALP 2017, July 10-14, 2017, Warsaw, Poland, volume 80 of LIPIcs, pages 121:1-121:15. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2017. URL: https://doi.org/10.4230/LIPIcs.ICALP.2017.121.
Benjamin Bordais, Shibashis Guha, and Jean-François Raskin. Expected Window Mean-Payoff. CoRR, abs/1812.09298, 2018. URL: http://arxiv.org/abs/1812.09298.
Patricia Bouyer, Mauricio González, Nicolas Markey, and Mickael Randour. Multi-weighted Markov Decision Processes with Reachability Objectives. In Andrea Orlandini and Martin Zimmermann, editors, Proceedings Ninth International Symposium on Games, Automata, Logics, and Formal Verification, GandALF 2018, Saarbrücken, Germany, 26-28th September 2018., volume 277 of EPTCS, pages 250-264, 2018. URL: https://doi.org/10.4204/EPTCS.277.18.
Tomás Brázdil, Krishnendu Chatterjee, Vojtech Forejt, and Antonín Kucera. Trading performance for stability in Markov decision processes. J. Comput. Syst. Sci., 84:144-170, 2017. URL: https://doi.org/10.1016/j.jcss.2016.09.009.
Tomás Brázdil, Vojtech Forejt, Antonín Kucera, and Petr Novotný. Stability in Graphs and Games. In Josée Desharnais and Radha Jagadeesan, editors, 27th International Conference on Concurrency Theory, CONCUR 2016, August 23-26, 2016, Québec City, Canada, volume 59 of LIPIcs, pages 10:1-10:14. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2016. URL: https://doi.org/10.4230/LIPIcs.CONCUR.2016.10.
Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, and Mickael Randour. Life is Random, Time is Not: Markov Decision Processes with Window Objectives. CoRR, abs/1901.03571, 2019. URL: http://arxiv.org/abs/1901.03571.
Véronique Bruyère, Emmanuel Filiot, Mickael Randour, and Jean-François Raskin. Meet your expectations with guarantees: Beyond worst-case synthesis in quantitative games. Inf. Comput., 254:259-295, 2017. URL: https://doi.org/10.1016/j.ic.2016.10.011.
Véronique Bruyère, Quentin Hautem, and Mickael Randour. Window parity games: an alternative approach toward parity games with time bounds. In Domenico Cantone and Giorgio Delzanno, editors, Proceedings of the Seventh International Symposium on Games, Automata, Logics and Formal Verification, GandALF 2016, Catania, Italy, 14-16 September 2016., volume 226 of EPTCS, pages 135-148, 2016. URL: https://doi.org/10.4204/EPTCS.226.10.
Véronique Bruyère, Quentin Hautem, and Jean-François Raskin. On the Complexity of Heterogeneous Multidimensional Games. In Josée Desharnais and Radha Jagadeesan, editors, 27th International Conference on Concurrency Theory, CONCUR 2016, August 23-26, 2016, Québec City, Canada, volume 59 of LIPIcs, pages 11:1-11:15. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2016. URL: https://doi.org/10.4230/LIPIcs.CONCUR.2016.11.
Cristian S. Calude, Sanjay Jain, Bakhadyr Khoussainov, Wei Li, and Frank Stephan. Deciding parity games in quasipolynomial time. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 252-263. ACM, 2017. URL: https://doi.org/10.1145/3055399.3055409.
Krishnendu Chatterjee, Laurent Doyen, Mickael Randour, and Jean-François Raskin. Looking at mean-payoff and total-payoff through windows. Inf. Comput., 242:25-52, 2015. URL: https://doi.org/10.1016/j.ic.2015.03.010.
Krishnendu Chatterjee and Monika Henzinger. Efficient and Dynamic Algorithms for Alternating Büchi Games and Maximal End-Component Decomposition. J. ACM, 61(3):15:1-15:40, 2014. URL: https://doi.org/10.1145/2597631.
Krishnendu Chatterjee, Thomas A. Henzinger, and Florian Horn. Finitary winning in omega-regular games. ACM Trans. Comput. Log., 11(1):1:1-1:27, 2009. URL: https://doi.org/10.1145/1614431.1614432.
Krishnendu Chatterjee, Marcin Jurdzinski, and Thomas A. Henzinger. Quantitative stochastic parity games. In J. Ian Munro, editor, Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2004, New Orleans, Louisiana, USA, January 11-14, 2004, pages 121-130. SIAM, 2004. URL: http://dl.acm.org/citation.cfm?id=982792.982808.
Wojciech Czerwinski, Laure Daviaud, Nathanaël Fijalkow, Marcin Jurdzinski, Ranko Lazic, and Pawel Parys. Universal trees grow inside separating automata: Quasi-polynomial lower bounds for parity games. CoRR, abs/1807.10546, 2018. URL: http://arxiv.org/abs/1807.10546.
Laure Daviaud, Marcin Jurdzinski, and Ranko Lazic. A pseudo-quasi-polynomial algorithm for mean-payoff parity games. In Anuj Dawar and Erich Grädel, editors, Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2018, Oxford, UK, July 09-12, 2018, pages 325-334. ACM, 2018. URL: https://doi.org/10.1145/3209108.3209162.
Christian Dehnert, Sebastian Junges, Joost-Pieter Katoen, and Matthias Volk. A Storm is Coming: A Modern Probabilistic Model Checker. In Rupak Majumdar and Viktor Kuncak, editors, Computer Aided Verification - 29th International Conference, CAV 2017, Heidelberg, Germany, July 24-28, 2017, Proceedings, Part II, volume 10427 of Lecture Notes in Computer Science, pages 592-600. Springer, 2017. URL: https://doi.org/10.1007/978-3-319-63390-9_31.
John Fearnley and Marcin Jurdzinski. Reachability in two-clock timed automata is PSPACE-complete. Inf. Comput., 243:26-36, 2015. URL: https://doi.org/10.1016/j.ic.2014.12.004.
Nathanaël Fijalkow, Pawel Gawrychowski, and Pierre Ohlmann. The complexity of mean payoff games using universal graphs. CoRR, abs/1812.07072, 2018. URL: http://arxiv.org/abs/1812.07072.
Jerzy Filar and Koos Vrieze. Competitive Markov decision processes. Springer, 1997.
Thomas Gawlitza and Helmut Seidl. Games through Nested Fixpoints. In Ahmed Bouajjani and Oded Maler, editors, Computer Aided Verification, 21st International Conference, CAV 2009, Grenoble, France, June 26 - July 2, 2009. Proceedings, volume 5643 of Lecture Notes in Computer Science, pages 291-305. Springer, 2009. URL: https://doi.org/10.1007/978-3-642-02658-4_24.
Erich Grädel, Wolfgang Thomas, and Thomas Wilke, editors. Automata, Logics, and Infinite Games: A Guide to Current Research [outcome of a Dagstuhl seminar, February 2001], volume 2500 of Lecture Notes in Computer Science. Springer, 2002. URL: https://doi.org/10.1007/3-540-36387-4.
Charles M. Grinstead and J. Laurie Snell. Introduction to probability. American Mathematical Society, 1997.
Christoph Haase and Stefan Kiefer. The Odds of Staying on Budget. In Magnús M. Halldórsson, Kazuo Iwama, Naoki Kobayashi, and Bettina Speckmann, editors, Automata, Languages, and Programming - 42nd International Colloquium, ICALP 2015, Kyoto, Japan, July 6-10, 2015, Proceedings, Part II, volume 9135 of Lecture Notes in Computer Science, pages 234-246. Springer, 2015. URL: https://doi.org/10.1007/978-3-662-47666-6_19.
Arnd Hartmanns, Sebastian Junges, Joost-Pieter Katoen, and Tim Quatmann. Multi-cost Bounded Reachability in MDP. In Dirk Beyer and Marieke Huisman, editors, Tools and Algorithms for the Construction and Analysis of Systems - 24th International Conference, TACAS 2018, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2018, Thessaloniki, Greece, April 14-20, 2018, Proceedings, Part II, volume 10806 of Lecture Notes in Computer Science, pages 320-339. Springer, 2018. URL: https://doi.org/10.1007/978-3-319-89963-3_19.
Paul Hunter, Guillermo A. Pérez, and Jean-François Raskin. Looking at mean payoff through foggy windows. Acta Inf., 55(8):627-647, 2018. URL: https://doi.org/10.1007/s00236-017-0304-7.
Neil Immerman. Number of Quantifiers is Better Than Number of Tape Cells. J. Comput. Syst. Sci., 22(3):384-406, 1981. URL: https://doi.org/10.1016/0022-0000(81)90039-8.
Marcin Jurdzinski. Deciding the Winner in Parity Games is in UP ∩ co-UP. Inf. Process. Lett., 68(3):119-124, 1998. URL: https://doi.org/10.1016/S0020-0190(98)00150-1.
Mickael Randour. Automated Synthesis of Reliable and Efficient Systems Through Game Theory: A Case Study. In Proc. of ECCS 2012, Springer Proceedings in Complexity XVII, pages 731-738. Springer, 2013. URL: https://doi.org/10.1007/978-3-319-00395-5_90.
Mickael Randour, Jean-François Raskin, and Ocan Sankur. Variations on the Stochastic Shortest Path Problem. In Deepak D'Souza, Akash Lal, and Kim Guldstrand Larsen, editors, Verification, Model Checking, and Abstract Interpretation - 16th International Conference, VMCAI 2015, Mumbai, India, January 12-14, 2015. Proceedings, volume 8931 of Lecture Notes in Computer Science, pages 1-18. Springer, 2015. URL: https://doi.org/10.1007/978-3-662-46081-8_1.
Mickael Randour, Jean-François Raskin, and Ocan Sankur. Percentile queries in multi-dimensional Markov decision processes. Formal Methods in System Design, 50(2-3):207-248, 2017. URL: https://doi.org/10.1007/s10703-016-0262-7.
Stéphane Le Roux, Arno Pauly, and Mickael Randour. Extending Finite-Memory Determinacy by Boolean Combination of Winning Conditions. In Sumit Ganguly and Paritosh K. Pandya, editors, 38th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2018, December 11-13, 2018, Ahmedabad, India, volume 122 of LIPIcs, pages 38:1-38:20. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2018. URL: https://doi.org/10.4230/LIPIcs.FSTTCS.2018.38.
Stephen D. Travers. The complexity of membership problems for circuits over sets of integers. Theor. Comput. Sci., 369(1-3):211-229, 2006. URL: https://doi.org/10.1016/j.tcs.2006.08.017.
Moshe Y. Vardi. Automatic Verification of Probabilistic Concurrent Finite-State Programs. In Proc. of FOCS, pages 327-338. IEEE, 1985.

Life Is Random, Time Is Not: Markov Decision Processes with Window Objectives

Authors Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, Mickael Randour

File

Document Identifiers

Author Details

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message

Life Is Random, Time Is Not: Markov Decision Processes with Window Objectives

Authors Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, Mickael Randour

File

Document Identifiers

Author Details

Funding

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

Related Versions

References

Thanks for your feedback!

Could not send message