Filtered Poisson process bandit on a continuum

Grant, James and Szechtman, Roberto (2021) Filtered Poisson process bandit on a continuum. European Journal of Operational Research, 295 (2). pp. 575-586. ISSN 0377-2217

FPPBanditEJOR_9.pdf - Accepted Version
Available under License Creative Commons Attribution-NonCommercial-NoDerivs.

Abstract

We consider a version of the continuum armed bandit where an action induces a filtered realisation of a non-homogeneous Poisson process. Point data in the filtered sample are then revealed to the decision-maker, whose reward is the total number of revealed points. Using knowledge of the function governing the filtering, but without knowledge of the Poisson intensity function, the decision-maker seeks to maximise the expected number of revealed points over T rounds. We propose an upper confidence bound algorithm for this problem utilising data-adaptive discretisation of the action space. This approach enjoys \tilde{O}(T^{2/3}) regret under a Lipschitz assumption on the reward function. We provide lower bounds on the regret of any algorithm for the problem, via new lower bounds for related finite-armed bandits, and show that the orders of the upper and lower bounds match up to a logarithmic factor.
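As a rough illustration of the feedback model described in the abstract, the following sketch simulates a single round: a non-homogeneous Poisson process is generated by thinning, each point is then revealed with a probability that depends on the chosen action, and the reward is the number of revealed points. Everything specific here is an assumption made for illustration and is not taken from the paper: the domain [0, 1], the sinusoidal `intensity`, the exponential-decay `filter_prob`, and the names `play_round` and `INTENSITY_MAX` are all hypothetical. The paper's upper confidence bound algorithm is not reproduced.

```python
import math
import random


def intensity(x):
    """Hypothetical Poisson intensity on [0, 1] (unknown to the decision-maker)."""
    return 10.0 * (1.0 + math.sin(2.0 * math.pi * x)) + 5.0


# Upper bound on intensity(x) over [0, 1]; required by the thinning step.
INTENSITY_MAX = 25.0


def filter_prob(action, x):
    """Hypothetical known filter: probability that a point at x is revealed."""
    return math.exp(-abs(x - action) / 0.2)


def play_round(action, rng):
    """Simulate one round and return the list of revealed points."""
    revealed = []
    # Candidate points of a homogeneous process with rate INTENSITY_MAX,
    # generated through exponential inter-arrival gaps.
    x = rng.expovariate(INTENSITY_MAX)
    while x < 1.0:
        # Thinning: keep the candidate with probability intensity(x) / INTENSITY_MAX,
        # which yields a non-homogeneous Poisson process with rate intensity(x).
        if rng.random() < intensity(x) / INTENSITY_MAX:
            # Filtering: the kept point is revealed with probability filter_prob.
            if rng.random() < filter_prob(action, x):
                revealed.append(x)
        x += rng.expovariate(INTENSITY_MAX)
    return revealed


if __name__ == "__main__":
    rng = random.Random(0)
    points = play_round(0.25, rng)
    print("reward this round:", len(points))
```

Under these assumed functions, the expected reward of an action is the integral of `intensity(x) * filter_prob(action, x)` over [0, 1], so actions near the intensity peak at 0.25 should tend to earn more than actions near the trough at 0.75.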

Item Type:

Journal Article

Journal or Publication Title:

European Journal of Operational Research

Uncontrolled Keywords:

/dk/atira/pure/subjectarea/asjc/2600/2611

Subjects:

applied probability; Poisson processes; multi-armed bandit; machine learning; modelling and simulation; management science and operations research; information systems and management

Departments:

Faculty of Science and Technology > Mathematics and Statistics

ID Code:

153070

Deposited By:

ep_importer_pure

Deposited On:

23 Mar 2021 13:30

Refereed?:

Yes

Published?:

Published

Last Modified:

29 Jun 2025 00:07

URI:

https://eprints.lancs.ac.uk/id/eprint/153070