The Operator Approach to Entropy Games

Authors Marianne Akian, Stéphane Gaubert, Julien Grand-Clément, Jérémie Guillaud



PDF
Thumbnail PDF

File

LIPIcs.STACS.2017.6.pdf
  • Filesize: 0.6 MB
  • 14 pages

Document Identifiers

Author Details

Marianne Akian
Stéphane Gaubert
Julien Grand-Clément
Jérémie Guillaud

Cite As Get BibTex

Marianne Akian, Stéphane Gaubert, Julien Grand-Clément, and Jérémie Guillaud. The Operator Approach to Entropy Games. In 34th Symposium on Theoretical Aspects of Computer Science (STACS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 66, pp. 6:1-6:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017) https://doi.org/10.4230/LIPIcs.STACS.2017.6

Abstract

Entropy games and matrix multiplication games have been recently introduced by Asarin et al. They model the situation in which one player (Despot) wishes to minimize the growth rate of a matrix product, whereas the other player (Tribune) wishes to maximize it. We develop an operator approach to entropy games. This allows us to show that entropy games can be cast as stochastic mean payoff games in which some action spaces are simplices and payments are given by a relative entropy (Kullback-Leibler divergence). In this way, we show that entropy games with a fixed number of states belonging to Despot can be solved in polynomial time. This approach also allows us to solve these games by a policy iteration algorithm, which we compare with the spectral simplex algorithm developed by Protasov.

Subject Classification

Keywords
  • Stochastic games
  • Shapley operators
  • policy iteration
  • Perron eigenvalues
  • Risk sensitive control

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. M. Akian, S. Gaubert, and A. Guterman. Tropical polyhedra are equivalent to mean payoff games. International Journal of Algebra and Computation, 22(1):125001 (43 pages), 2012. http://arxiv.org/abs/0912.2462, URL: http://dx.doi.org/10.1142/S0218196711006674.
  2. M. Akian, S. Gaubert, and R. Nussbaum. A Collatz-Wielandt characterization of the spectral radius of order-preserving homogeneous maps on cones, 2011. URL: https://arxiv.org/abs/1112.5968.
  3. V. Anantharam and V. S. Borkar. A variational formula for risk-sensitive reward, 2015. arXiv:1501.00676. Google Scholar
  4. D. Andersson and P. B. Miltersen. The complexity of solving stochastic games on graphs. In Proceedings of ISAAC'09, number 5878 in LNCS. Springer, 2009. Google Scholar
  5. E. Asarin, J. Cervelle, A. Degorre, C. Dima, F. Horn, and V. Kozyakin. Entropy games and matrix multiplication games. In 33rd Symposium on Theoretical Aspects of Computer Science, STACS 2016, February 17-20, 2016, Orléans, France, pages 11:1-11:14, 2016. URL: http://dx.doi.org/10.4230/LIPIcs.STACS.2016.11.
  6. A. Berman and R. J. Plemmons. Nonnegative matrices in the mathematical sciences. Academic Press, 1994. Google Scholar
  7. V. D. Blondel and Y. Nesterov. Polynomial-time computation of the joint spectral radius for some sets of nonnegative matrices. SIAM J. Matrix Anal., 31(3):865-876, 2009. Google Scholar
  8. J. Bolte, S. Gaubert, and G. Vigeral. Definable zero-sum stochastic games. Mathematics of Operations Research, 40(1):171-191, 2014. http://arxiv.org/abs/1301.1967, URL: http://dx.doi.org/10.1287/moor.2014.0666.
  9. J. M. Borwein and P. B. Borwein. On the complexity of familiar functions and numbers. SIAM Review, 30(4):589-601, 1988. Google Scholar
  10. T. Chen and T. Han. On the complexity of computing maximum entropy for markovian models. In 34th International Conference on Foundation of Software Technology and Theoretical Computer Science, FSTTCS 2014, December 15-17, 2014, New Delhi, India, pages 571-583, 2014. URL: http://dx.doi.org/10.4230/LIPIcs.FSTTCS.2014.571.
  11. M. D. Donsker and S. R. S. Varadhan. On a variational formula for the principal eigenvalue for operators with maximum principle. Proc. Nat. Acad. Sci. USA, 72(3):780-783, 1975. Google Scholar
  12. L. van den Dries. Tame topology and o-minimal structures, volume 248 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge, 1998. URL: http://dx.doi.org/10.1017/CBO9780511525919.
  13. W. H. Fleming and D. Hernández-Hernández. Risk-sensitive control of finite state machines on an infinite horizon. I. SIAM J. Control Optim., 35(5):1790-1810, 1997. URL: http://dx.doi.org/10.1137/S0363012995291622.
  14. W. H. Fleming and D. Hernández-Hernández. Risk-sensitive control of finite state machines on an infinite horizon. II. SIAM J. Control Optim., 37(4):1048-1069 (electronic), 1999. URL: http://dx.doi.org/10.1137/S0363012997321498.
  15. S. Gaubert and J. Gunawardena. A non-linear hierarchy for discrete event dynamical systems. In Proc. of the Fourth Workshop on Discrete Event Systems (WODES98), pages 249-254, Cagliari, Italy, 1998. IEE. Google Scholar
  16. S. Gaubert and J. Gunawardena. The Perron-Frobenius theorem for homogeneous, monotone functions. Trans. of AMS, 356(12):4931-4950, 2004. http://arxiv.org/abs/math.FA/0105091, URL: http://dx.doi.org/10.1090/S0002-9947-04-03470-1.
  17. M. Grötschel, L. Lovász, and A. Schrijver. The ellipsoid method and its consequences in combinatorial optimization. Combinatorica, 1(2):169-197, 1981. Google Scholar
  18. T. D. Hansen, P. B. Miltersen, and U. Zwick. Strategy iteration is strongly polynomial for 2-player turn-based stochastic games with a constant discount factor. In Innovations in Computer Science 2011, pages 253-263. Tsinghua University Press, 2011. Google Scholar
  19. A. J. Hoffman and R. M. Karp. On nonterminating stochastic games. Management Science. Journal of the Institute of Management Science. Application and Theory Series, 12:359-370, 1966. Google Scholar
  20. V. Kozyakin. Hourglass alternative and the finiteness conjecture for the spectral characteristics of sets of non-negative matrices. arXiv:1507.00492, 2015. Google Scholar
  21. M. Lothaire. Applied Combinatorics on Words. Cambridge, 2005. Google Scholar
  22. R. D. Nussbaum. Convexity and log convexity for the spectral radius. Linear Algebra Appl., 73:59-122, 1986. Google Scholar
  23. V. Yu. Protasov. Spectral simplex method. Mathematical Programming, 2015. URL: http://dx.doi.org/10.1007/s10107-015-0905-2.
  24. M. L. Puterman. Markov Decision Processes. Wiley, 2005. Google Scholar
  25. R. T. Rockafellar and R. J.-B. Wets. Variational analysis. Springer-Verlag, Berlin, 1998. URL: http://dx.doi.org/10.1007/978-3-642-02431-3.
  26. U. G. Rothblum. Multiplicative markov decision chains. Mathematics of Operations Research, 9(1):6-24, 1984. Google Scholar
  27. S. M. Rump. Polynomial minimum root separation. Mathematics of Computation, 145(33):327-336, 1979. Google Scholar
  28. L. van den Dries. o-minimal structures and real analytic geometry. In Current developments in mathematics, 1998 (Cambridge, MA), pages 105-152. Int. Press, Somerville, MA, 1999. Google Scholar
  29. A. J. Wilkie. Model completeness results for expansions of the ordered field of real numbers by restricted Pfaffian functions and the exponential function. J. Amer. Math. Soc., 9(4):1051-1094, 1996. URL: http://dx.doi.org/10.1090/S0894-0347-96-00216-0.
  30. Y. Ye. The simplex and policy-iteration methods are strongly polynomial for the markov decision problem with a fixed discount rate, 2011. URL: http://dx.doi.org/10.1287/moor.1110.0516.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail