Filtered Poisson Process Bandit on a Continuum

Grant, James A. and Szechtman, Roberto (2020) Filtered Poisson Process Bandit on a Continuum. arXiv.org.

Text (2007.09966v1): 2007.09966v1.pdf (5MB)

Abstract

We consider a version of the continuum armed bandit where an action induces a filtered realisation of a non-homogeneous Poisson process. Point data in the filtered sample are then revealed to the decision-maker, whose reward is the total number of revealed points. Using knowledge of the function governing the filtering, but without knowledge of the Poisson intensity function, the decision-maker seeks to maximise the expected number of revealed points over T rounds. We propose an upper confidence bound algorithm for this problem utilising data-adaptive discretisation of the action space. This approach enjoys O(T^(2/3)) regret under a Lipschitz assumption on the reward function. We provide lower bounds on the regret of any algorithm for the problem, via new lower bounds for related finite-armed bandits, and show that the orders of the upper and lower bounds match up to a logarithmic factor.
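The reward mechanism described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' algorithm: it assumes the process lives on [0, 1], that the filter reveals each point independently with a probability depending on the action and the point's location, and it uses a hypothetical intensity `example_intensity` and a hypothetical filter `example_filter`. None of these specifics come from the paper. Points are drawn by the standard thinning method for non-homogeneous Poisson processes, and the reward for a round is the number of points that survive the filter.

```python
import math
import random


def sample_nhpp(intensity, lam_max, rng):
    """Sample a non-homogeneous Poisson process on [0, 1] by thinning.

    `intensity` must satisfy 0 <= intensity(x) <= lam_max on [0, 1].
    """
    points = []
    x = 0.0
    while True:
        # Candidate points come from a homogeneous process of rate lam_max.
        x += rng.expovariate(lam_max)
        if x > 1.0:
            return points
        # Keep each candidate with probability intensity(x) / lam_max.
        if rng.random() < intensity(x) / lam_max:
            points.append(x)


def play_round(action, intensity, lam_max, filter_prob, rng):
    """Return the points revealed to the decision-maker for one action.

    Assumption: each point x is revealed independently with probability
    filter_prob(action, x). The reward is len() of the returned list.
    """
    points = sample_nhpp(intensity, lam_max, rng)
    return [x for x in points if rng.random() < filter_prob(action, x)]


def example_intensity(x):
    # Hypothetical intensity, bounded above by 40 on [0, 1].
    return 20.0 * (1.0 + math.sin(2.0 * math.pi * x))


def example_filter(action, x):
    # Hypothetical filter: points near the action are more likely revealed.
    return math.exp(-5.0 * abs(x - action))


if __name__ == "__main__":
    rng = random.Random(0)
    revealed = play_round(0.25, example_intensity, 40.0, example_filter, rng)
    print("reward this round:", len(revealed))
```

In this sketch the decision-maker would know `example_filter` but not `example_intensity`, and would have to learn which action yields the most revealed points from the returned locations alone.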

Item Type:

Journal Article

Journal or Publication Title:

arXiv.org

Subjects:

cs.LG, stat.ML

Departments:

Faculty of Science and Technology > Mathematics and Statistics
Lancaster University Management School > Management Science

ID Code:

147409

Deposited By:

ep_importer_pure

Deposited On:

12 Oct 2020 15:30

Refereed?:

Published?:

Published

Last Modified:

11 Dec 2025 05:33

URI:

https://eprints.lancs.ac.uk/id/eprint/147409