Discounted multi-armed bandit problems on a collection of machines with varying speeds

Glazebrook, Kevin and Dunn, R. T. (2004) Discounted multi-armed bandit problems on a collection of machines with varying speeds. Mathematics of Operations Research, 29 (2). pp. 266-279. ISSN 0364-765X

Full text not available from this repository.

Abstract

This paper is the first to consider general multiarmed bandit problems on parallel machines working at different speeds. Block allocation policies make a once-for-all allocation of bandits to machines at time zero. In this class we describe how to achieve Blackwell optimality under given conditions. The block allocation policy identified allocates the bandits with the largest guaranteed reward rates to the machines operating at greatest speed. This policy is shown to be average-reward optimal in the class of general (nonanticipative, nonidling) policies.
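
The allocation rule in the abstract (largest guaranteed reward rates go to the fastest machines) can be illustrated with a toy sketch. This is not the paper's construction. It assumes one bandit per machine, treats each guaranteed reward rate as a known scalar, and assumes a machine of speed s earns reward at s times the bandit's rate. The names block_allocate, total_rate, reward_rates and machine_speeds are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Tuple


def block_allocate(
    reward_rates: Dict[str, float],
    machine_speeds: Dict[str, float],
) -> List[Tuple[str, str]]:
    """Pair bandits with machines once, at time zero.

    Assumption (not from the paper): exactly one bandit per machine.
    The bandit with the largest rate is paired with the fastest machine,
    the second largest with the second fastest, and so on.
    """
    if len(reward_rates) != len(machine_speeds):
        raise ValueError("this sketch assumes one bandit per machine")
    # Sort both sides in descending order and match them position by position.
    bandits = sorted(reward_rates, key=reward_rates.get, reverse=True)
    machines = sorted(machine_speeds, key=machine_speeds.get, reverse=True)
    return list(zip(bandits, machines))


def total_rate(
    pairs: List[Tuple[str, str]],
    reward_rates: Dict[str, float],
    machine_speeds: Dict[str, float],
) -> float:
    """Toy objective: sum of (machine speed) x (bandit reward rate)."""
    return sum(reward_rates[b] * machine_speeds[m] for b, m in pairs)


# Illustrative data only; these numbers are made up.
rates = {"a": 1.0, "b": 3.0, "c": 2.0}
speeds = {"m1": 2.0, "m2": 0.5, "m3": 1.0}
pairs = block_allocate(rates, speeds)
```

Under this toy objective, the sorted pairing maximises the sum of speed times rate by the rearrangement inequality, which gives some intuition for why matching the best bandits to the fastest machines is a natural candidate. The optimality results stated in the abstract concern a much more general stochastic setting.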

Item Type:

Journal Article

Journal or Publication Title:

Mathematics of Operations Research

Additional Information:

RAE_import_type: Journal article; RAE_uoa_type: Statistics and Operational Research

Uncontrolled Keywords:

/dk/atira/pure/subjectarea/asjc/1800/1803

Subjects:

average reward optimality; Blackwell optimality; Gittins index; multiarmed bandit; sensitive discount optimality; management science and operations research; general mathematics; computer science applications; mathematics(all); QA mathematics; discipline-based research

Departments:

Faculty of Science and Technology > Mathematics and Statistics

ID Code:

2424

Deposited By:

ep_importer

Deposited On:

31 Mar 2008 12:08

Refereed?:

Yes

Published?:

Published

Last Modified:

16 Jul 2024 08:27

URI:

https://eprints.lancs.ac.uk/id/eprint/2424