
2003 | Original Paper | Book Chapter

Lower Bounds on the Sample Complexity of Exploration in the Multi-armed Bandit Problem

Authors: Shie Mannor, John N. Tsitsiklis

Published in: Learning Theory and Kernel Machines

Publisher: Springer Berlin Heidelberg


We consider the multi-armed bandit problem under the PAC (“probably approximately correct”) model. It was shown by Even-Dar et al. [5] that given $n$ arms, it suffices to play the arms a total of $O\big((n/\epsilon^2)\log(1/\delta)\big)$ times to find an $\epsilon$-optimal arm with probability at least $1-\delta$. Our contribution is a matching lower bound that holds for any sampling policy. We also generalize the lower bound to a Bayesian setting, and to the case where the statistics of the arms are known but the identities of the arms are not.
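To illustrate the kind of sampling policy the bound applies to, here is a minimal sketch of the *naive* uniform-sampling strategy, which by a Hoeffding plus union-bound argument uses $O\big((n/\epsilon^2)\log(n/\delta)\big)$ pulls — a $\log n$ factor above the optimal bound, which the median-elimination algorithm of Even-Dar et al. removes. The function name `naive_pac_best_arm` and the `pull` callback (assumed to return rewards in $[0,1]$) are illustrative, not from the paper.

```python
import math


def naive_pac_best_arm(pull, n, eps, delta):
    """Return the index of an empirically best arm.

    Naive PAC strategy (illustrative, not the paper's algorithm):
    pull every arm the same number of times and pick the arm with the
    highest empirical mean.  With m = ceil((2/eps^2) * ln(2n/delta))
    pulls per arm, Hoeffding's inequality plus a union bound over the
    n arms guarantees the returned arm is eps-optimal with probability
    at least 1 - delta.  Total pulls: O((n/eps^2) * log(n/delta)).

    `pull(i)` is assumed to return a reward in [0, 1] for arm i.
    """
    m = math.ceil((2.0 / eps**2) * math.log(2 * n / delta))
    means = [sum(pull(i) for _ in range(m)) / m for i in range(n)]
    return max(range(n), key=lambda i: means[i])
```

For example, with three deterministic arms paying 0.1, 0.5, and 0.9, the procedure identifies arm 2 as best; the point of the paper's lower bound is that no policy, however clever, can beat $\Omega\big((n/\epsilon^2)\log(1/\delta)\big)$ total pulls in the worst case.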

Metadata
Title
Lower Bounds on the Sample Complexity of Exploration in the Multi-armed Bandit Problem
Authors
Shie Mannor
John N. Tsitsiklis
Copyright year
2003
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-540-45167-9_31