An Online Incremental Learning Approach for Configuring Multi-arm Bandits Algorithms

Alsomali, Mohammad and Rodrigues-Filho, Roberto and Soriano Marcolino, Leandro and Porter, Barry (2024) An Online Incremental Learning Approach for Configuring Multi-arm Bandits Algorithms. In: ECAI: European Conference on Artificial Intelligence. (In Press)

Available under License Creative Commons Attribution-NonCommercial.

Abstract

This paper introduces Dynamic Bayesian Optimisation for Multi-Arm Bandits (DBO-MAB), an algorithm that dynamically adapts hyperparameters of multi-arm bandit algorithms using incremental Bayesian optimisation. DBO-MAB addresses the challenge of tuning hyperparameters in uncertain and dynamic environments, particularly for applications like web server optimisation. It uses a dynamic range adjustment approach based on the interquartile mean (IQM) of observed rewards to focus the search space on promising regions. Evaluated across diverse static and dynamic environments, DBO-MAB outperforms state-of-the-art algorithms such as Bootstrapped UCB and f-Discounted-Sliding-Window Thompson Sampling, reducing average response time by ≈ 55%.
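The interquartile mean mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the truncating IQM variant (dropping a quarter of the values, rounded down, from each tail), and the range-narrowing rule are all assumptions made for illustration, as is the convention that a higher reward is better.

```python
from statistics import mean


def interquartile_mean(rewards):
    """Mean of the middle half of the values.

    Simplified variant: drops len // 4 values from each tail
    instead of weighting the boundary values fractionally.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    ordered = sorted(rewards)
    cut = len(ordered) // 4  # values discarded from each tail
    return mean(ordered[cut:len(ordered) - cut])


def narrow_range(trials, low, high):
    """Hypothetical rule (an assumption, not taken from the paper):
    shrink [low, high] to span the hyperparameter values whose
    observed reward is at least the IQM of all observed rewards.

    `trials` is a list of (hyperparameter_value, reward) pairs.
    """
    iqm = interquartile_mean([reward for _, reward in trials])
    kept = [value for value, reward in trials if reward >= iqm]
    if not kept:  # defensive: keep the old range if nothing qualifies
        return low, high
    return max(low, min(kept)), min(high, max(kept))


# Example with made-up (value, reward) observations.
observations = [(0.1, 1.0), (0.3, 5.0), (0.5, 6.0), (0.9, 2.0)]
new_low, new_high = narrow_range(observations, 0.0, 1.0)
```

Because the IQM discards both tails, a single extreme reward (for example one unusually slow or fast response) shifts the threshold far less than it would shift a plain mean, which is one plausible reason to prefer it when rewards are noisy.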

Item Type:

Contribution in Book/Report/Proceedings

Uncontrolled Keywords:

Research Output Funding/no_not_funded

Departments:

Faculty of Science and Technology > School of Computing & Communications

ID Code:

224190

Deposited By:

ep_importer_pure

Deposited On:

04 Oct 2024 13:40

Refereed?:

Yes

Published?:

In Press

Last Modified:

19 Jun 2025 23:18

URI:

https://eprints.lancs.ac.uk/id/eprint/224190