An Online Incremental Learning Approach for Configuring Multi-arm Bandits Algorithms

Alsomali, Mohammad and Rodrigues-Filho, Roberto and Soriano Marcolino, Leandro and Porter, Barry (2024) An Online Incremental Learning Approach for Configuring Multi-arm Bandits Algorithms. In: ECAI: European Conference on Artificial Intelligence. (In Press)

m2033.pdf - Accepted Version
Available under License Creative Commons Attribution-NonCommercial.

Abstract

This paper introduces Dynamic Bayesian Optimisation for Multi-Arm Bandits (DBO-MAB), an algorithm that dynamically adapts hyperparameters of multi-arm bandit algorithms using incremental Bayesian optimisation. DBO-MAB addresses the challenge of tuning hyperparameters in uncertain and dynamic environments, particularly for applications like web server optimisation. It uses a dynamic range adjustment approach based on the interquartile mean (IQM) of observed rewards to focus the search space on promising regions. Evaluated across diverse static and dynamic environments, DBO-MAB outperforms state-of-the-art algorithms such as Bootstrapped UCB and f-Discounted-Sliding-Window Thompson Sampling, reducing average response time by ≈ 55%.
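The abstract does not spell out the range adjustment rule, so the following is only a minimal sketch of the general idea, not the authors' method. It computes the interquartile mean of observed rewards (here simplified to dropping the lowest and highest quarter of the sorted values) and uses it as a threshold to recentre and shrink a one-dimensional hyperparameter range. The function names `interquartile_mean` and `narrow_range`, the `shrink` factor, and the recentring rule are all assumptions introduced for illustration.

```python
from statistics import mean


def interquartile_mean(values):
    """Mean of the middle 50% of values.

    Simplification: drops floor(n/4) items from each end of the sorted list.
    """
    ordered = sorted(values)
    k = len(ordered) // 4
    return mean(ordered[k:len(ordered) - k])


def narrow_range(trials, low, high, shrink=0.5):
    """Hypothetical range adjustment (an assumption, not the paper's rule).

    trials: list of (hyperparameter_value, observed_reward) pairs.
    Recentres the range on the mean of the hyperparameter values whose
    reward is at least the IQM of all rewards, shrinks its width by
    `shrink`, and clips the result to the original bounds.
    """
    rewards = [reward for _, reward in trials]
    threshold = interquartile_mean(rewards)
    promising = [value for value, reward in trials if reward >= threshold]
    centre = mean(promising)
    half_width = (high - low) * shrink / 2
    return max(low, centre - half_width), min(high, centre + half_width)


if __name__ == "__main__":
    # Toy data: higher reward is better (e.g. negated response time).
    observed = [(0.1, 1.0), (0.2, 2.0), (0.3, 3.0), (0.4, 4.0)]
    print(narrow_range(observed, low=0.0, high=1.0))
```

Using a trimmed statistic such as the IQM as the threshold makes the adjustment less sensitive to occasional extreme rewards than a plain mean would be, which is one plausible reason for choosing it in noisy environments.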

Item Type:
Contribution in Book/Report/Proceedings
Uncontrolled Keywords:
Research Output Funding/no_not_funded
ID Code:
224190
Deposited On:
04 Oct 2024 13:40
Refereed?:
Yes
Published?:
In Press
Last Modified:
13 Nov 2024 01:45