Entropic Risk for Turn-Based Stochastic Games

Authors Christel Baier , Krishnendu Chatterjee , Tobias Meggendorfer , Jakob Piribauer

Thumbnail PDF


  • Filesize: 0.78 MB
  • 16 pages

Document Identifiers

Author Details

Christel Baier
  • Technische Universität Dresden, Germany
Krishnendu Chatterjee
  • Institute of Science and Technology Austria (ISTA), Klosterneuburg, Austria
Tobias Meggendorfer
  • Institute of Science and Technology Austria (ISTA), Klosterneuburg, Austria
  • Technische Universität München, Germany
Jakob Piribauer
  • Technische Universität Dresden, Germany
  • Technische Universität München, Germany

Cite AsGet BibTex

Christel Baier, Krishnendu Chatterjee, Tobias Meggendorfer, and Jakob Piribauer. Entropic Risk for Turn-Based Stochastic Games. In 48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 272, pp. 15:1-15:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Entropic risk (ERisk) is an established risk measure in finance, quantifying risk by an exponential re-weighting of rewards. We study ERisk for the first time in the context of turn-based stochastic games with the total reward objective. This gives rise to an objective function that demands the control of systems in a risk-averse manner. We show that the resulting games are determined and, in particular, admit optimal memoryless deterministic strategies. This contrasts risk measures that previously have been considered in the special case of Markov decision processes and that require randomization and/or memory. We provide several results on the decidability and the computational complexity of the threshold problem, i.e. whether the optimal value of ERisk exceeds a given threshold. In the most general case, the problem is decidable subject to Shanuel’s conjecture. If all inputs are rational, the resulting threshold problem can be solved using algebraic numbers, leading to decidability via a polynomial-time reduction to the existential theory of the reals. Further restrictions on the encoding of the input allow the solution of the threshold problem in NP∩coNP. Finally, an approximation algorithm for the optimal value of ERisk is provided.

Subject Classification

ACM Subject Classification
  • Theory of computation → Logic and verification
  • Stochastic games
  • risk-aware verification


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. Ilan Adler and Peter A. Beling. Polynomial algorithms for linear programming over the algebraic numbers. Algorithmica, 12(6):436-457, 1994. Google Scholar
  2. Hubert Asienkiewicz and Anna Jaśkiewicz. A note on a new class of recursive utilities in Markov decision processes. Applicationes Mathematicae, 44:149-161, 2017. Google Scholar
  3. Christel Baier, Krishnendu Chatterjee, Tobias Meggendorfer, and Jakob Piribauer. Entropic risk for turn-based stochastic games, 2023. arXiv preprint. URL: https://arxiv.org/abs/2307.06611.
  4. Christel Baier and Joost-Pieter Katoen. Principles of model checking. MIT Press, 2008. Google Scholar
  5. Nicole Bäuerle and Ulrich Rieder. More risk-sensitive Markov decision processes. Mathematics of Operations Research, 39(1):105-120, 2014. Google Scholar
  6. Peter A. Beling. Exact algorithms for linear programming over algebraic extensions. Algorithmica, 31(4):459-478, 2001. Google Scholar
  7. Dimitri P. Bertsekas and John N. Tsitsiklis. An analysis of stochastic shortest path problems. Math. Oper. Res., 16(3):580-595, 1991. Google Scholar
  8. Dimitri P. Bertsekas and John N. Tsitsiklis. Neuro-dynamic programming, volume 3 of Optimization and neural computation series. Athena Scientific, 1996. URL: https://www.worldcat.org/oclc/35983505.
  9. Patrick Billingsley. Probability and measure. John Wiley & Sons, 2008. Google Scholar
  10. Mario Brandtner, Wolfgang Kürsten, and Robert Rischau. Entropic risk measures and their comparative statics in portfolio selection: Coherence vs. convexity. European Journal of Operational Research, 264(2):707-716, 2018. Google Scholar
  11. Tomás Brázdil, Krishnendu Chatterjee, Vojtech Forejt, and Antonín Kucera. Trading performance for stability in Markov decision processes. J. Comput. Syst. Sci., 84:144-170, 2017. Google Scholar
  12. Krishnendu Chatterjee. Robustness of structurally equivalent concurrent parity games. In International Conference on Foundations of Software Science and Computational Structures, pages 270-285. Springer, 2012. Google Scholar
  13. Krishnendu Chatterjee and Thomas A. Henzinger. Value iteration. In 25 Years of Model Checking, volume 5000 of Lecture Notes in Computer Science, pages 107-138. Springer, 2008. Google Scholar
  14. Krishnendu Chatterjee and Thomas A. Henzinger. A survey of stochastic ω-regular games. J. Comput. Syst. Sci., 78(2):394-413, 2012. Google Scholar
  15. Krishnendu Chatterjee, Koushik Sen, and Thomas A. Henzinger. Model-checking omega-regular properties of interval Markov chains. In FoSSaCS, volume 4962 of Lecture Notes in Computer Science, pages 302-317. Springer, 2008. Google Scholar
  16. Taolue Chen, Vojtech Forejt, Marta Z. Kwiatkowska, David Parker, and Aistis Simaitis. Automatic verification of competitive stochastic systems. Formal Methods Syst. Des., 43(1):61-92, 2013. URL: https://doi.org/10.1007/s10703-013-0183-7.
  17. E.J. Collins. Finite-horizon variance penalised Markov decision processes. Operations-Research-Spektrum, 19(1):35-39, 1997. Google Scholar
  18. Anne Condon. On algorithms for simple stochastic games. Advances in computational complexity theory, 13:51-72, 1990. Google Scholar
  19. Anne Condon. The complexity of stochastic games. Information and Computation, 96(2):203-224, 1992. Google Scholar
  20. Giovanni B Di Masi and Lukasz Stettner. Risk-sensitive control of discrete-time Markov processes with infinite horizon. SIAM Journal on Control and Optimization, 38(1):61-78, 1999. Google Scholar
  21. Jerzy Filar and Koos Vrieze. Competitive Markov decision processes. Springer Science & Business Media, 2012. Google Scholar
  22. Jerzy A. Filar, Lodewijk C. M. Kallenberg, and Huey-Miin Lee. Variance-penalized Markov decision processes. Math. Oper. Res., 14(1):147-161, 1989. Google Scholar
  23. Hans Föllmer and Thomas Knispel. Entropic risk measures: Coherence vs. convexity, model ambiguity and robust large deviations. Stochastics and Dynamics, 11(02n03):333-351, 2011. Google Scholar
  24. Hans Föllmer and Alexander Schied. Convex measures of risk and trading constraints. Finance and stochastics, 6(4):429-447, 2002. Google Scholar
  25. Hans Föllmer, Alexander Schied, and T Lyons. Stochastic finance. an introduction in discrete time. The Mathematical Intelligencer, 26(4):67-68, 2004. Google Scholar
  26. Vojtech Forejt, Marta Z. Kwiatkowska, Gethin Norman, and David Parker. Automated verification techniques for probabilistic systems. In SFM, volume 6659 of Lecture Notes in Computer Science, pages 53-113. Springer, 2011. Google Scholar
  27. Liping Fu and Larry R Rilett. Expected shortest paths in dynamic and stochastic traffic networks. Transportation Research Part B: Methodological, 32(7):499-516, 1998. Google Scholar
  28. Daniel T Gillespie. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. Journal of Computational Physics, 22(4):403-434, 1976. Google Scholar
  29. S. Gómez, A. Arenas, J. Borge-Holthoefer, S. Meloni, and Y. Moreno. Discrete-time Markov chain approach to contact-based disease spreading in complex networks. EPL (Europhysics Letters), 89(3), February 2010. Google Scholar
  30. Christoph Haase and Stefan Kiefer. The odds of staying on budget. In ICALP (2), volume 9135 of Lecture Notes in Computer Science, pages 234-246. Springer, 2015. Google Scholar
  31. Christoph Haase, Stefan Kiefer, and Markus Lohrey. Computing quantiles in Markov chains with multi-dimensional costs. In LICS, pages 1-12. IEEE Computer Society, 2017. Google Scholar
  32. Serge Haddad and Benjamin Monmege. Interval iteration algorithm for mdps and imdps. Theor. Comput. Sci., 735:111-131, 2018. Google Scholar
  33. Ronald A Howard and James E Matheson. Risk-sensitive Markov decision processes. Management science, 18(7):356-369, 1972. Google Scholar
  34. Stratton C Jaquette. A utility criterion for Markov decision processes. Management Science, 23(1):43-49, 1976. Google Scholar
  35. Jan Kretínský and Tobias Meggendorfer. Conditional value-at-risk for reachability and mean payoff in Markov decision processes. In LICS, pages 609-618. ACM, 2018. Google Scholar
  36. Jan Kretínský, Tobias Meggendorfer, and Maximilian Weininger. Stopping criteria for value iteration on stochastic games with quantitative objectives. In LICS, 2023. Google Scholar
  37. Serge Lang. Introduction to transcendental numbers. Addison-Wesley Publishing Company, 1966. Google Scholar
  38. Angus Macintyre and Alex J. Wilkie. On the decidability of the real exponential field. In Piergiorgio Odifreddi, editor, Kreiseliana. About and Around Georg Kreisel, pages 441-467. A K Peters, 1996. Google Scholar
  39. A Maitra and W Sudderth. Stochastic games with borel payoffs. In Stochastic Games and Applications, pages 367-373. Springer, 2003. Google Scholar
  40. Shie Mannor and John N. Tsitsiklis. Mean-variance optimization in Markov decision processes. In ICML, pages 177-184. Omnipress, 2011. Google Scholar
  41. Donald A Martin. Borel determinacy. Annals of Mathematics, 102(2):363-371, 1975. Google Scholar
  42. Tobias Meggendorfer. Risk-aware stochastic shortest path. In AAAI, pages 9858-9867. AAAI Press, 2022. Google Scholar
  43. Johan Paulsson. Summing up the noise in gene networks. Nature, 427(6973):415-418, 2004. Google Scholar
  44. Jakob Piribauer and Christel Baier. On skolem-hardness and saturation points in Markov decision processes. In ICALP, volume 168 of LIPIcs, pages 138:1-138:17. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020. Google Scholar
  45. Jakob Piribauer, Ocan Sankur, and Christel Baier. The variance-penalized stochastic shortest path problem. In ICALP, volume 229 of LIPIcs, pages 129:1-129:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022. Google Scholar
  46. Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley Series in Probability and Statistics. Wiley, 1994. Google Scholar
  47. Mickael Randour, Jean-François Raskin, and Ocan Sankur. Variations on the stochastic shortest path problem. In VMCAI, volume 8931 of Lecture Notes in Computer Science, pages 1-18. Springer, 2015. Google Scholar
  48. Mickael Randour, Jean-François Raskin, and Ocan Sankur. Percentile queries in multi-dimensional Markov decision processes. Formal Methods Syst. Des., 50(2-3):207-248, 2017. URL: https://doi.org/10.1007/s10703-016-0262-7.
  49. Lloyd S Shapley. Stochastic games. Proceedings of the national academy of sciences, 39(10):1095-1100, 1953. Google Scholar
  50. Florent Teichteil-Königsbuch, Ugur Kuter, and Guillaume Infantes. Incremental plan aggregation for generating policies in mdps. In AAMAS, pages 1231-1238. IFAAMAS, 2010. Google Scholar
  51. Michael Ummels and Christel Baier. Computing quantiles in Markov reward models. In FoSSaCS, volume 7794 of Lecture Notes in Computer Science, pages 353-368. Springer, 2013. Google Scholar
  52. Lou van den Dries, Angus Macintyre, and David Marker. The elementary theory of restricted analytic fields with exponentiation. Annals of Mathematics, 140(1):183-205, 1994. Google Scholar
  53. Maximilian Weininger, Tobias Meggendorfer, and Jan Kretínský. Satisfiability bounds for ω-regular properties in bounded-parameter Markov decision processes. In CDC, pages 2284-2291. IEEE, 2019. Google Scholar