Arabzadeh, Ali and Grant, James A. and Leslie, David (2025) Robust and personalised online learning. PhD thesis, Lancaster University.
2025ArabzadehPhD.pdf - Published Version
Restricted to Repository staff only until 1 June 2027.
Available under License Creative Commons Attribution.
Abstract
Over the past decade, multi-armed bandits have attracted significant attention from the online learning and machine learning communities, owing to their broad applicability in both theory and practice. From recommendation systems to adaptive control problems, the bandit framework effectively tackles the exploration-exploitation trade-off by modelling sequential decision-making problems in a mathematically tractable manner. Meanwhile, the rapid growth of data generated by smartphones, edge devices, and networked sensors has created an urgent need for private and decentralised solutions. Federated learning meets this need by enabling collaborative model training without centralising raw user data, thus preserving user privacy and mitigating the risks associated with data transfer. In this thesis, we integrate advanced online learning solutions into a federated environment, focusing on the X-armed bandit problem, a generalisation of multi-armed bandits to continuous action spaces. We present two major lines of work: one addressing the personalisation challenge by adapting to heterogeneous user distributions, and another ensuring robustness when facing corrupted or adversarial clients. Our solutions employ an optimistic, phase-based approach that enhances efficiency, supported by confidence bounds that guarantee reliable performance. Beyond the federated setting, we also introduce a corruption-robust solution for the centralised version of X-armed bandits, providing theoretical guarantees on performance under adversarial perturbations. Rigorous theoretical analyses confirm the effectiveness of our methods and offer insights into robust, privacy-aware sequential decision-making in distributed environments.
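The optimistic, confidence-bound approach to the exploration-exploitation trade-off described above can be illustrated with a minimal sketch of the classical UCB1 rule for finitely many arms (a standard textbook algorithm, not the thesis's X-armed or federated method; the arm distributions and function names here are hypothetical):

```python
import math
import random

def ucb1(reward_fns, horizon, seed=0):
    """Minimal UCB1 sketch: play each arm once, then repeatedly pick the
    arm maximising empirical mean reward plus a confidence bonus."""
    rng = random.Random(seed)
    k = len(reward_fns)
    counts = [0] * k      # pulls per arm
    sums = [0.0] * k      # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # initialisation: try every arm once
        else:
            # optimism in the face of uncertainty: mean + confidence width
            arm = max(range(k),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        r = reward_fns[arm](rng)
        counts[arm] += 1
        sums[arm] += r
    return counts

# Two hypothetical Bernoulli arms with means 0.3 and 0.7
arms = [lambda rng: float(rng.random() < 0.3),
        lambda rng: float(rng.random() < 0.7)]
counts = ucb1(arms, horizon=2000)
```

As the confidence widths shrink, play concentrates on the better arm while suboptimal arms are still sampled occasionally; the thesis extends this optimism principle to continuous action spaces and to federated, corruption-robust settings.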