Yazidi, Anis and Hammer, Hugo and Leslie, David (2024) A Two-Timescale Learning Automata Solution to the Nonlinear Stochastic Proportional Polling Problem. IEEE Transactions on Systems, Man, and Cybernetics: Systems. ISSN 2168-2216
Polling_LA_IEEE_Papadimitriou.pdf - Accepted Version
Available under License Creative Commons Attribution.
Download (832kB)
Abstract
In this article, we introduce a novel learning automata (LA) solution to the nonlinear stochastic proportional polling (NSPP) problem. The only available solution to this problem in the literature is that given by Nicopolitidis et al. (2003), Obaidat et al. (2002), and Papadimitriou et al. (2002). It was shown to solve a large set of the adaptive resource allocation problems under noisy environments (Nicopolitidis et al., 2003; Obaidat et al., 2002; Papadimitriou and Pomportsis, 2000 and 1999; Nicopolitidis et al., 2004; Obaidat et al., 2001; and Papadimitriou and Pomportsis, 2000). We make a threefold contribution. First, we take a two-timescale approach to the field of LA by estimating the reward probabilities on a faster timescale than the timescale for updating the polling probabilities. Second, by making a not-obvious choice of the objective function, we show that the NSPP problem is indeed an instantiation of the stochastic nonlinear fractional equality knapsack (NFEK) problem, which is a substantial resource allocation problem based on the incomplete and noisy information (Granmo and Oommen, 2010). Third, in contrast to the legacy approach taken by Papadimitriou and Maritsas (1992 and 1996), we show through the extensive experimental results that our solution is remarkably robust to the choice of tuning parameters and that it outperforms the state of the art solution in terms of the Bayesian expected loss.