2003 | Original Paper | Book Chapter
Lower Bounds on the Sample Complexity of Exploration in the Multi-armed Bandit Problem
Authors: Shie Mannor, John N. Tsitsiklis
Published in: Learning Theory and Kernel Machines
Publisher: Springer Berlin Heidelberg
Included in: Professional Book Archive
We consider the multi-armed bandit problem under the PAC (“probably approximately correct”) model. It was shown by Even-Dar et al. [5] that, given n arms, it suffices to play the arms a total of $O\big(({n}/{\epsilon^2})\log ({1}/{\delta})\big)$ times to find an $\epsilon$-optimal arm with probability at least $1-\delta$. Our contribution is a matching lower bound that holds for any sampling policy. We also generalize the lower bound to a Bayesian setting, and to the case where the statistics of the arms are known but the identities of the arms are not.
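To make the PAC guarantee concrete, the following is a minimal sketch of the naive baseline the upper bound improves on: sample every arm equally often and return the empirically best one. By a Hoeffding bound, pulling each arm $m = \lceil (2/\epsilon^2)\ln(2n/\delta)\rceil$ times suffices for the returned arm to be ε-optimal with probability at least 1-δ, at a total cost of $O((n/\epsilon^2)\log(n/\delta))$ samples; the Even-Dar et al. algorithm (median elimination) removes the extra log(n) factor. The Bernoulli arms and all names here are hypothetical illustrations, not from the paper.

```python
import math
import random

def pac_best_arm(pull, n, epsilon, delta, rng):
    """Naive uniform-sampling PAC arm selection (illustrative baseline).

    Pulls each of the n arms m = ceil((2/eps^2) * ln(2n/delta)) times
    and returns the index of the arm with the highest empirical mean.
    A standard Hoeffding argument shows the result is epsilon-optimal
    with probability at least 1 - delta.
    """
    m = math.ceil((2.0 / epsilon**2) * math.log(2.0 * n / delta))
    empirical = [sum(pull(i, rng) for _ in range(m)) / m for i in range(n)]
    return max(range(n), key=lambda i: empirical[i])

# Hypothetical Bernoulli arms for illustration.
true_means = [0.5, 0.55, 0.6, 0.8]

def bernoulli_pull(i, rng):
    return 1.0 if rng.random() < true_means[i] else 0.0

rng = random.Random(0)
best = pac_best_arm(bernoulli_pull, n=4, epsilon=0.1, delta=0.05, rng=rng)
```

The lower bound in the paper says that no sampling policy, however adaptive, can beat the $O((n/\epsilon^2)\log(1/\delta))$ total-sample scaling in the worst case.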