We give polynomial time algorithms for quantitative (and qualitative)
Branching Markov Decision Processes
(BMDPs). Specifically, given a BMDP, and given an initial population, where the objective of the controller is to maximize (or minimize) the probability of eventually reaching a population that contains an object of a desired (or undesired) type, we give algorithms for approximating the supremum (infimum) reachability probability, within desired precision
$$\epsilon > 0$$
, in time polynomial in the encoding size of the BMDP and in
$$\log (1/\epsilon )$$
. We furthermore give P-time algorithms for computing
-optimal strategies for both maximization and minimization of reachability probabilities. We also give P-time algorithms for all associated
analysis problems, namely: deciding whether the optimal (supremum or infimum) reachability probabilities are 0 or 1. Prior to this paper, approximation of optimal reachability probabilities for BMDPs was not even known to be decidable.
Our algorithms exploit the following basic fact: we show that for any BMDP, its maximum (minimum)
-reachability probabilities are given by the
greatest fixed point
$$g^* \in [0,1]^n$$
of a corresponding monotone max (min) Probabilistic Polynomial System of equations (max/min-PPS),
, which are the Bellman optimality equations for a BMDP with non-reachability objectives. We show how to compute the GFP of max/min PPSs to desired precision in P-time.