,
Mateusz Skomra
Creative Commons Attribution 4.0 International license
We devise a policy-iteration algorithm for deterministic two-player discounted and mean-payoff games, that runs in polynomial time with high probability, on any input where each payoff is chosen independently from a sufficiently random distribution and the underlying graph of the game is ergodic. This includes the case where an arbitrary set of payoffs has been perturbed by a Gaussian, showing for the first time that deterministic two-player games can be solved efficiently, in the sense of smoothed analysis. More generally, we devise a condition number for deterministic discounted and mean-payoff games played on ergodic graphs, and show that our algorithm runs in time polynomial in this condition number. Our result confirms a previous conjecture of Boros et al., which was claimed as a theorem [Boros et al., 2011] and later retracted [Boros et al., 2018]. It stands in contrast with a recent counter-example by Christ and Yannakakis [Christ and Yannakakis, 2023], showing that Howard’s policy-iteration algorithm does not run in smoothed polynomial time on stochastic single-player mean-payoff games. Our approach is inspired by the analysis of random optimal assignment instances by Frieze and Sorkin [Frieze and Sorkin, 2007], and the analysis of bias-induced policies for mean-payoff games by Akian, Gaubert and Hochart [Akian et al., 2018].
@InProceedings{loff_et_al:LIPIcs.ICALP.2024.147,
author = {Loff, Bruno and Skomra, Mateusz},
title = {{Smoothed Analysis of Deterministic Discounted and Mean-Payoff Games}},
booktitle = {51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
pages = {147:1--147:16},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-322-5},
ISSN = {1868-8969},
year = {2024},
volume = {297},
editor = {Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.147},
URN = {urn:nbn:de:0030-drops-202908},
doi = {10.4230/LIPIcs.ICALP.2024.147},
annote = {Keywords: Mean-payoff games, discounted games, policy iteration, smoothed analysis}
}