The Variance-Penalized Stochastic Shortest Path Problem

Piribauer, Jakob; Sankur, Ocan; Baier, Christel

doi:10.4230/LIPIcs.ICALP.2022.129

Subject Classification

ACM Subject Classification

Theory of computation → Verification by model checking

Keywords

Markov decision process
variance
stochastic shortest path problem

Abstract

The stochastic shortest path problem (SSPP) asks to resolve the non-deterministic choices in a Markov decision process (MDP) such that the expected accumulated weight before reaching a target state is maximized. This paper addresses the optimization of the variance-penalized expectation (VPE) of the accumulated weight, which is a variant of the SSPP in which a multiple of the variance of accumulated weights is incurred as a penalty. It is shown that the optimal VPE in MDPs with non-negative weights as well as an optimal deterministic finite-memory scheduler can be computed in exponential space. The threshold problem whether the maximal VPE exceeds a given rational is shown to be EXPTIME-hard and to lie in NEXPTIME. Furthermore, a result of interest in its own right obtained on the way is that a variance-minimal scheduler among all expectation-optimal schedulers can be computed in polynomial time.
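To make the objective concrete, the following is a minimal sketch, not the algorithm from the paper. It assumes that a scheduler has already been fixed, so that the MDP collapses to a Markov chain in which the target is reached almost surely, and it evaluates the VPE as expectation minus lam times variance. The name lam for the penalty factor, the dictionary encoding of the chain, and the helper names solve and vpe are illustrative assumptions. The expectation and the second moment of the accumulated weight are obtained from two linear equation systems, solved exactly over rationals.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination over Fractions."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def vpe(chain, start, lam):
    """Return (expectation, variance, VPE) of the weight accumulated from start.

    chain maps each non-target state to a list of (prob, weight, successor);
    successor None stands for the target (hypothetical encoding).
    """
    states = list(chain)
    idx = {s: i for i, s in enumerate(states)}
    n = len(states)
    # Matrix I - P restricted to the non-target states.
    a = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    b1 = [Fraction(0)] * n
    for s, edges in chain.items():
        for p, w, t in edges:
            b1[idx[s]] += p * w
            if t is not None:
                a[idx[s]][idx[t]] -= p
    e = solve(a, b1)  # e(s) = sum p * (w + e(t))
    b2 = [Fraction(0)] * n
    for s, edges in chain.items():
        for p, w, t in edges:
            e_t = e[idx[t]] if t is not None else 0
            b2[idx[s]] += p * (w * w + 2 * w * e_t)
    m = solve(a, b2)  # m(s) = sum p * (w^2 + 2 w e(t) + m(t))
    i = idx[start]
    var = m[i] - e[i] ** 2
    return e[i], var, e[i] - lam * var


half = Fraction(1, 2)
# "safe": weight 2 for sure; "risky": weight 0 or 6 with probability 1/2 each.
safe = {"s": [(Fraction(1), 2, None)]}
risky = {"s": [(half, 0, None), (half, 6, None)]}

print(vpe(safe, "s", Fraction(1, 4)))   # expectation 2, variance 0, VPE 2
print(vpe(risky, "s", Fraction(1, 4)))  # expectation 3, variance 9, VPE 3/4
```

In this toy instance the risky choice has the higher expectation (3 versus 2), but with penalty factor 1/4 its variance of 9 lowers its VPE to 3/4, so the safe choice is preferable. With penalty factor 0 the comparison reduces to the classical SSPP and the risky choice wins.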

Cite As

Jakob Piribauer, Ocan Sankur, and Christel Baier. The Variance-Penalized Stochastic Shortest Path Problem. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 129:1-129:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022) https://doi.org/10.4230/LIPIcs.ICALP.2022.129

Author Details

Jakob Piribauer

Technische Universität Dresden, Germany

Ocan Sankur

Univ Rennes, Inria, CNRS, IRISA, France

Christel Baier

Technische Universität Dresden, Germany
