eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2010-12-14
65
72
10.4230/LIPIcs.FSTTCS.2010.65
article
Playing in stochastic environment: from multi-armed bandits to two-player games
Zielonka, Wieslaw
Given a zero-sum infinite game, we examine whether the players have optimal memoryless deterministic strategies. It turns out that, under some general conditions, the problem for two-player games can be reduced to the same problem for one-player games, which in turn can be reduced to a simpler related problem for multi-armed bandits.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol008-fsttcs2010/LIPIcs.FSTTCS.2010.65/LIPIcs.FSTTCS.2010.65.pdf
two-player zero-sum game
one-player zero-sum game
multi-armed bandit
memoryless deterministic strategy