In this paper, we study the two-player generalized cops and robbers (GCR) games introduced by Bonato and MacGillivray. Our main goals are to provide: (a) a game-theoretic formulation of GCR and (b) a self-contained game-theoretic proof that GCR has a value and an optimal strategy profile. To achieve our goals, we first formulate GCR (and CR as a special case) as a zero-sum stochastic game. Then we study a Vertex Labeling (VL) algorithm and prove it computes the value of the GCR game (for every starting “condition”) and that the vertex labels can be used to specify a positional deterministic optimal strategy for each player. We also compare our game-theoretic analysis to some of the usual graph theoretic/combinatorial approaches to CR/GCR.
Let us add that the following remarks apply (to a certain extent) even to the simpler, original version of CR/GCR which is concerned with winning (i.e., capturing/evading) rather than time optimal strategies.
Note that in the usual CR description time is counted in rounds, where each round encompasses (in our terminology) one cop and one robber turn. The two approaches are essentially equivalent.
Let us stress that they still deal with a two-player game: there is a single cop player and a single robber player, but each can control one or more (cop or robber) tokens.
There is one caveat: it is never specified in [5] whether once a state achieves a finite label can be subsequently relabeled (it should not); this is probably an oversight in the description.