We consider multiple-environment Markov decision processes (MEMDP), which consist of a finite set of MDPs over the same state space, representing different scenarios of transition structure and probability. The value of a strategy is the probability to satisfy the objective, here a parity objective, in the worst-case scenario, and the value of an MEMDP is the supremum of the values achievable by a strategy. We show that deciding whether the value is 1 is a PSPACE-complete problem, and even in P when the number of environments is fixed, along with new insights into the almost-sure winning problem, which is to decide if there exists a strategy with value 1. Pure strategies are sufficient for these problems, whereas randomization is necessary in general when the value is smaller than 1. We present an algorithm to approximate the value, running in double exponential space. Our results are in contrast to the related model of partially-observable MDPs where all these problems are known to be undecidable.
@inproceedings{chatterjee_et_al:LIPIcs.ICALP.2025.150,
  author    = {Chatterjee, Krishnendu and Doyen, Laurent and Raskin, Jean-Fran{\c{c}}ois and Sankur, Ocan},
  title     = {The Value Problem for Multiple-Environment {MDPs} with Parity Objective},
  booktitle = {52nd International Colloquium on Automata, Languages, and Programming ({ICALP} 2025)},
  pages     = {150:1--150:17},
  series    = {Leibniz International Proceedings in Informatics (LIPIcs)},
  isbn      = {978-3-95977-372-0},
  issn      = {1868-8969},
  year      = {2025},
  volume    = {334},
  editor    = {Censor-Hillel, Keren and Grandoni, Fabrizio and Ouaknine, Jo{\"e}l and Puppis, Gabriele},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address   = {Dagstuhl, Germany},
  url       = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2025.150},
  urn       = {urn:nbn:de:0030-drops-235272},
  doi       = {10.4230/LIPIcs.ICALP.2025.150},
  annote    = {Keywords: Markov decision processes, imperfect information, randomized strategies, limit-sure winning},
}
Feedback for Dagstuhl Publishing