Solving Irreducible Stochastic Mean-Payoff Games and Entropy Games by Relative Krasnoselskii-Mann Iteration

Authors: Marianne Akian, Stéphane Gaubert, Ulysse Naepels, Basile Terver




File

LIPIcs.MFCS.2023.10.pdf
  • Filesize: 0.75 MB
  • 15 pages

Author Details

Marianne Akian
  • INRIA and CMAP, École polytechnique, IP Paris, CNRS, France
Stéphane Gaubert
  • INRIA and CMAP, École polytechnique, IP Paris, CNRS, France
Ulysse Naepels
  • École polytechnique, IP Paris, France
Basile Terver
  • École polytechnique, IP Paris, France

Acknowledgements

We thank the reviewers for helpful comments.

Cite As

Marianne Akian, Stéphane Gaubert, Ulysse Naepels, and Basile Terver. Solving Irreducible Stochastic Mean-Payoff Games and Entropy Games by Relative Krasnoselskii-Mann Iteration. In 48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 272, pp. 10:1-10:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/LIPIcs.MFCS.2023.10

Abstract

We analyse an algorithm for solving stochastic mean-payoff games that combines the ideas of relative value iteration and of Krasnoselskii-Mann damping. We derive parameterized complexity bounds for several classes of games satisfying irreducibility conditions. We show in particular that an ε-approximation of the value of an irreducible concurrent stochastic game can be computed in O(|log(ε)|) iterations, where the constant in the O(⋅) is explicit and depends on the smallest non-zero transition probabilities. This should be compared with a bound in O(ε^{-1}|log(ε)|) obtained by Chatterjee and Ibsen-Jensen (ICALP 2014) for the same class of games, and with an O(ε^{-1}) bound obtained by Allamigeon, Gaubert, Katz and Skomra (ICALP 2022) for turn-based games. We also establish parameterized complexity bounds for entropy games, a class of matrix multiplication games introduced by Asarin, Cervelle, Degorre, Dima, Horn and Kozyakin. We derive these results by methods of variational analysis, establishing contraction properties of the relative Krasnoselskii-Mann iteration with respect to Hilbert’s semi-norm.
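
To make the idea concrete, here is a minimal sketch of a relative Krasnoselskii-Mann iteration for a small turn-based game. It is an illustration under stated assumptions, not the authors' implementation: the update rule x ← (1−θ)x + θ(T(x) − max T(x)), the damping θ = 1/2, the stopping test on Hilbert's semi-norm of T(x) − x, the game encoding, and the names `shapley` and `relative_km` are all choices made for this example. The returned interval brackets the mean payoff because T is monotone and commutes with the addition of constants.

```python
from typing import List, Sequence, Tuple

# Hypothetical encoding: game[i][a][b] = (reward, transition probabilities).
# In state i, player Min picks a, then player Max picks b.
Outcome = Tuple[float, Sequence[float]]
Game = List[List[List[Outcome]]]


def shapley(game: Game, x: Sequence[float]) -> List[float]:
    """Shapley operator: T_i(x) = min_a max_b (r + sum_j p_j * x_j)."""
    return [
        min(
            max(r + sum(pj * xj for pj, xj in zip(p, x)) for r, p in row)
            for row in state
        )
        for state in game
    ]


def relative_km(game: Game, eps: float = 1e-6, theta: float = 0.5,
                max_iter: int = 100_000) -> Tuple[float, float, int]:
    """Return (lo, hi, iterations) with hi - lo <= eps bracketing the value."""
    x = [0.0] * len(game)
    for k in range(max_iter):
        tx = shapley(game, x)
        diff = [t - xi for t, xi in zip(tx, x)]
        lo, hi = min(diff), max(diff)
        # Hilbert's semi-norm of T(x) - x is max minus min.
        if hi - lo <= eps:
            return lo, hi, k
        top = max(tx)  # "relative" step: renormalise so that max is 0
        # Krasnoselskii-Mann damping: average old iterate and new image.
        x = [(1 - theta) * xi + theta * (t - top) for xi, t in zip(x, tx)]
    raise RuntimeError("no convergence within max_iter")


# Illustrative two-state game with all transition probabilities positive.
example_game: Game = [
    [[(1.0, [0.9, 0.1]), (0.0, [0.2, 0.8])], [(2.0, [0.5, 0.5])]],
    [[(3.0, [0.3, 0.7]), (-1.0, [0.6, 0.4])],
     [(0.5, [0.1, 0.9]), (1.5, [0.7, 0.3])]],
]

lo, hi, iters = relative_km(example_game)
print(f"value in [{lo:.6f}, {hi:.6f}] after {iters} iterations")
```

The example game has strictly positive transition probabilities, which is a simple way to guarantee the contraction in Hilbert's semi-norm that the stopping test relies on; for games that are not irreducible the loop may hit `max_iter`.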

Subject Classification

ACM Subject Classification
  • Theory of computation → Algorithmic game theory
Keywords
  • Stochastic mean-payoff games
  • concurrent games
  • entropy games
  • relative value iteration
  • Krasnoselskii-Mann fixed point algorithm
  • Hilbert projective metric


References

  1. M. Akian, S. Gaubert, J. Grand-Clément, and J. Guillaud. The operator approach to entropy games. Theory of Computing Systems, 63:1089-1130, 2019.
  2. M. Akian, S. Gaubert, and A. Hochart. Ergodicity conditions for zero-sum games. Discrete Contin. Dyn. Syst., 35(9):3901-3931, 2015.
  3. M. Akian, A. Sulem, and M. I. Taksar. Dynamic optimization of long-term growth rate for a portfolio with transaction costs and logarithmic utility. Mathematical Finance, 11(2):153-188, April 2001.
  4. Marianne Akian, Stéphane Gaubert, Ulysse Naepels, and Basile Terver. Solving irreducible stochastic mean-payoff games and entropy games by relative Krasnoselskii-Mann iteration, 2023. Extended version of the present article, arXiv:2305.02458.
  5. X. Allamigeon, S. Gaubert, R. D. Katz, and M. Skomra. Universal Complexity Bounds Based on Value Iteration and Application to Entropy Games. In Mikołaj Bojańczyk, Emanuela Merelli, and David P. Woodruff, editors, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022), volume 229 of Leibniz International Proceedings in Informatics (LIPIcs), pages 110:1-110:20, Dagstuhl, Germany, 2022. Schloss Dagstuhl - Leibniz-Zentrum für Informatik.
  6. V. Anantharam and V. S. Borkar. A variational formula for risk-sensitive reward. SIAM J. Control Optim., 55(2):961-988, 2017.
  7. D. Andersson and P. B. Miltersen. The complexity of solving stochastic games on graphs. In Proceedings of the 20th International Symposium on Algorithms and Computation (ISAAC), volume 5878 of Lecture Notes in Comput. Sci., pages 112-121. Springer, 2009.
  8. E. Asarin, J. Cervelle, A. Degorre, C. Dima, F. Horn, and V. Kozyakin. Entropy games and matrix multiplication games. In Proceedings of the 33rd International Symposium on Theoretical Aspects of Computer Science (STACS), volume 47 of LIPIcs. Leibniz Int. Proc. Inform., pages 11:1-11:14, Wadern, 2016. Schloss Dagstuhl-Leibniz-Zentrum für Informatik.
  9. L. Attia and M. Oliu-Barton. A formula for the value of a stochastic game. PNAS, 116(52):26435-26443, 2019.
  10. J. B. Baillon and R. E. Bruck. Optimal rates of asymptotic regularity for averaged nonexpansive mappings. In K. K. Tan, editor, Proceedings of the Second International Conference on Fixed Point Theory and Applications, pages 27-66. World Scientific Press, 1992.
  11. T. Bewley and E. Kohlberg. The asymptotic theory of stochastic games. Math. Oper. Res., 1(3):197-208, 1976.
  12. E. Boros, K. Elbassioni, V. Gurvich, and K. Makino. A potential reduction algorithm for two-person zero-sum mean payoff stochastic games. Dynamic Games and Applications, 8(1):22-41, July 2018.
  13. K. Chatterjee and R. Ibsen-Jensen. The complexity of ergodic mean-payoff games. Extended version of a paper published in the proceedings of ICALP, 2014. URL: https://arxiv.org/abs/1404.5734.
  14. A. Condon. The complexity of stochastic games. Inform. and Comput., 96(2):203-224, 1992.
  15. R. L. Dobrushin. Central limit theorem for nonstationary Markov chains. I. Theory of Probability & Its Applications, 1(1):65-80, January 1956.
  16. K. Etessami and M. Yannakakis. Recursive concurrent stochastic games. Logical Methods in Computer Science, 4(4), November 2008.
  17. A. Federgruen, P. J. Schweitzer, and H. C. Tijms. Contraction mappings underlying undiscounted Markov decision problems. Journal of Mathematical Analysis and Applications, 65(3):711-730, 1978.
  18. S. Gaubert and J. Gunawardena. The Perron-Frobenius theorem for homogeneous, monotone functions. Trans. of AMS, 356(12):4931-4950, 2004.
  19. S. Gaubert and N. Stott. A convergent hierarchy of non-linear eigenproblems to compute the joint spectral radius of nonnegative matrices. Mathematical Control and Related Fields, 10(3):573-590, 2020.
  20. D. Gillette. Stochastic games with zero stop probabilities, volume III, chapter 9, pages 179-188. Princeton University Press, 1958.
  21. K. Arnsfelt Hansen, M. Koucky, N. Lauritzen, P. Bro Miltersen, and E. P. Tsigaridas. Exact algorithms for solving stochastic games. In STOC 2011, 2011.
  22. A. J. Hoffman and R. M. Karp. On nonterminating stochastic games. Manag. Sci., 12(5):359-370, 1966.
  23. R. A. Howard and J. E. Matheson. Risk-sensitive Markov decision processes. Management Science, 18(7):356-369, 1972.
  24. S. Ishikawa. Fixed points and iteration of a nonexpansive mapping in a Banach space. Proceedings of the American Mathematical Society, 59(1):65-71, 1976.
  25. M. A. Krasnosel’skiĭ. Two remarks on the method of successive approximations. Uspekhi Matematicheskikh Nauk, 10:123-127, 1955.
  26. T. M. Liggett and S. A. Lippman. Stochastic games with perfect information and time average payoff. SIAM Rev., 11:604-607, 1969.
  27. W. R. Mann. Mean value methods in iteration. Proceedings of the American Mathematical Society, 4:506-510, 1953.
  28. J.-F. Mertens and A. Neyman. Stochastic games. Internat. J. Game Theory, 10(2):53-66, 1981.
  29. J.-F. Mertens, S. Sorin, and S. Zamir. Repeated games, volume 55 of Econom. Soc. Monogr. Cambridge University Press, Cambridge, 2015.
  30. H. D. Mills. Marginal values of matrix games and linear programs. In H. W. Kuhn and A. W. Tucker, editors, Linear Inequalities and Related Systems, volume 38 of Annals of Mathematics Studies, pages 183-194. Princeton University Press, 1956.
  31. D. Rosenberg and S. Sorin. An operator approach to zero-sum repeated games. Israel J. Math., 121(1):221-246, 2001.
  32. U. G. Rothblum. Multiplicative Markov decision chains. Mathematics of Operations Research, 9(1):6-24, 1984.
  33. U. G. Rothblum and P. Whittle. Growth optimality for branching Markov decision chains. Mathematics of Operations Research, 7(4):582-601, 1982.
  34. L. S. Shapley. Stochastic games. Proc. Natl. Acad. Sci. USA, 39(10):1095-1100, 1953.
  35. M. Skomra. Optimal bounds for bit-sizes of stationary distributions in finite Markov chains. Preprint arXiv:2109.04976, 2021.
  36. K. Sladký. On dynamic programming recursions for multiplicative Markov decision chains, pages 216-226. Springer Berlin Heidelberg, Berlin, Heidelberg, 1976.
  37. G. Vigeral. A zero-sum stochastic game with compact action sets and no asymptotic value. Dynamic Games and Applications, 3(2):172-186, January 2013.
  38. D. J. White. Dynamic programming, Markov chains, and the method of successive approximations. Journal of Mathematical Analysis and Applications, 6(3):373-376, 1963.
  39. W. H. M. Zijm. Asymptotic expansions for dynamic programming recursions with general nonnegative matrices. J. Optim. Theory Appl., 54(1):157-191, 1987.
  40. U. Zwick and M. Paterson. The complexity of mean payoff games on graphs. Theoret. Comput. Sci., 158(1-2):343-359, 1996.