Regret Analysis of Stochastic and Nonstochastic Multi-Armed Bandit Problems

Author: Sébastien Bubeck and Nicolò Cesa-Bianchi
Publisher: Now Publishers
ISBN: 9781601986276
Category: Artificial intelligence
Languages: en
Pages : 137


Book Description
Multi-armed bandit problems are the most basic examples of sequential decision problems with an exploration-exploitation trade-off: the balance between staying with the option that gave the highest payoffs in the past and exploring new options that might give higher payoffs in the future. The monograph focuses on two extreme cases in which the analysis of regret is particularly simple and elegant: independent and identically distributed payoffs and adversarial payoffs. Besides the basic setting of finitely many actions, it also analyzes some of the most important variants and extensions, such as the contextual bandit model.
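The exploration-exploitation trade-off in the stochastic (i.i.d.) setting described above is commonly illustrated with the UCB1 strategy, which plays the arm maximizing an empirical mean plus an exploration bonus. The sketch below is a minimal illustration of that idea, not code from the monograph; the `pull` callback and the Bernoulli arm means are hypothetical.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """Minimal UCB1 sketch for stochastic bandits: after trying each arm
    once, play the arm with the largest upper confidence bound."""
    counts = [0] * n_arms      # number of times each arm was pulled
    sums = [0.0] * n_arms      # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1        # initialization: pull every arm once
        else:
            # empirical mean + sqrt(2 log t / n_a) exploration bonus
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                                    + math.sqrt(2 * math.log(t) / counts[a]))
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts

# Hypothetical Bernoulli arms with means 0.2, 0.5, 0.8 (arm 2 is optimal).
random.seed(0)
means = [0.2, 0.5, 0.8]
counts = ucb1(lambda a: 1.0 if random.random() < means[a] else 0.0,
              n_arms=3, horizon=2000)
```

Over the horizon, the pull counts concentrate on the optimal arm while the bonus term guarantees each arm is still sampled logarithmically often, which is what keeps the regret of UCB1 logarithmic in the stochastic case.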