2010 | OriginalPaper | Chapter
A Pumping Algorithm for Ergodic Stochastic Mean Payoff Games with Perfect Information
Authors : Endre Boros, Khaled Elbassioni, Vladimir Gurvich, Kazuhisa Makino
Published in: Integer Programming and Combinatorial Optimization
Publisher: Springer Berlin Heidelberg
Activate our intelligent search to find suitable subject content or patents.
Select sections of text to find matching patents with Artificial Intelligence. powered by
Select sections of text to find additional relevant content using AI-assisted search. powered by
In this paper, we consider two-person zero-sum stochastic mean payoff games with perfect information, or BWR-games, given by a digraph
G
= (
V
=
V
B
∪
V
W
∪
V
R
,
E
), with local rewards
$r: E \to {\mathbb R}$
, and three types of vertices: black
V
B
, white
V
W
, and random
V
R
. The game is played by two players, White and Black: When the play is at a white (black) vertex
v
, White (Black) selects an outgoing arc (
v
,
u
). When the play is at a random vertex
v
, a vertex
u
is picked with the given probability
p
(
v
,
u
). In all cases, Black pays White the value
r
(
v
,
u
). The play continues forever, and White aims to maximize (Black aims to minimize) the limiting mean (that is, average) payoff. It was recently shown in [7] that BWR-games are polynomially equivalent with the classical Gillette games, which include many well-known subclasses, such as cyclic games, simple stochastic games (SSG′s), stochastic parity games, and Markov decision processes. In this paper, we give a new algorithm for solving BWR-games in the
ergodic case
, that is when the optimal values do not depend on the initial position. Our algorithm solves a BWR-game by reducing it, using a potential transformation, to a canonical form in which the optimal strategies of both players and the value for every initial position are obvious, since a locally optimal move in it is optimal in the whole game. We show that this algorithm is pseudo-polynomial when the number of random nodes is constant. We also provide an almost matching lower bound on its running time, and show that this bound holds for a wider class of algorithms. Let us add that the general (non-ergodic) case is at least as hard as SSG′s, for which no pseudo-polynomial algorithm is known.