Asymptotic Bayes analysis for the finite-horizon one-armed-bandit problem

Apostolos N. Burnetas, Michael N. Katehakis

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

The multiarmed-bandit problem is often taken as a basic model for the trade-off between the exploration and exploitation required for efficient optimization under uncertainty. In this article, we study the situation in which the unknown performance of a new bandit is to be evaluated and compared with that of a known one over a finite horizon. We assume that the bandits represent random variables with distributions from the one-parameter exponential family. When the objective is to maximize the Bayes expected sum of outcomes over a finite horizon, it is shown that optimal policies tend to simple limits when the length of the horizon is large.
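The finite-horizon problem described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: it restricts the exponential family to Bernoulli arms with a Beta prior, and all names (`bayes_value`, `lam`, `a0`, `b0`) are illustrative. It computes, by backward induction, the Bayes expected sum of outcomes when the known arm pays `lam` per pull and the unknown arm is Bernoulli with a Beta(a0, b0) prior; in a one-armed bandit, once the known arm is chosen it is optimal to stay with it, so retiring with t pulls remaining is worth `lam * t`.

```python
from functools import lru_cache

def bayes_value(horizon, lam, a0=1, b0=1):
    """Bayes-optimal expected total reward for a Bernoulli one-armed bandit.

    Unknown arm: Bernoulli(p) with p ~ Beta(a0, b0).
    Known arm: pays lam per pull; once chosen, it is optimal to keep it.
    """
    @lru_cache(maxsize=None)
    def V(a, b, t):
        if t == 0:
            return 0.0
        retire = lam * t                     # switch to the known arm forever
        p = a / (a + b)                      # posterior mean of the unknown arm
        # Pull the unknown arm: success updates to (a+1, b), failure to (a, b+1).
        explore = p * (1.0 + V(a + 1, b, t - 1)) + (1.0 - p) * V(a, b + 1, t - 1)
        return max(retire, explore)

    return V(a0, b0, horizon)
```

For example, with a uniform prior and `lam = 0.5`, a horizon of 2 gives value 13/12 > 1.0: experimentation carries option value beyond the myopic payoff, which is the exploration/exploitation trade-off the abstract refers to.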

Original language: English (US)
Pages (from-to): 53-82
Number of pages: 30
Journal: Probability in the Engineering and Informational Sciences
Volume: 17
Issue number: 1
DOIs
State: Published - Jan 1 2003

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
  • Management Science and Operations Research
  • Industrial and Manufacturing Engineering

