Stein, Anja and Leslie, David and Frigessi, Arnoldo (2023) Sequential Inference with the Mallows Model. PhD thesis, Lancaster University.
Abstract
The Mallows model is a widely used probabilistic model for analysing rank data. It assumes that each assessor can rank a collection of n items, and that each ranking can be summarised as a permutation of size n. The associated probability distribution is defined on the space of permutations of these items. A hierarchical Bayesian framework for the Mallows model, named the Bayesian Mallows model, has been developed recently to perform inference and to provide uncertainty estimates of the model parameters. This framework typically uses Markov chain Monte Carlo (MCMC) methods to simulate from the target posterior distribution. However, MCMC can be considerably slow to update when new ranking data arrive, since each new batch of data demands additional computational effort. It can therefore be difficult to update the Bayesian Mallows model in real time. This thesis extends the Bayesian Mallows model to allow for sequential updates of its posterior estimates each time a collection of new preference data is observed. The posterior is updated over a sequence of discrete time steps with fixed computational complexity. This can be achieved using Sequential Monte Carlo (SMC) methods. SMC offers a standard alternative to MCMC by approximating a sequence of posterior distributions with a set of weighted samples. The samples are propagated via a combination of importance sampling, resampling and move steps. We propose an SMC framework that can perform sequential updates of the posterior distribution for both a single Mallows model and a Mallows mixture model each time we observe new full rankings in an online setting. We also construct a framework to conduct SMC with partial rankings for a single Mallows model. We propose an alternative proposal distribution for data augmentation in partial rankings that incorporates the current posterior estimates of the Mallows model parameters in each SMC iteration.
We also extend the framework to consider how the posterior is updated when previously observed assessors provide additional information in their partial rankings. We show how the latent information is corrected to account for the resulting changes in the posterior.
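
The importance sampling and resampling steps described in the abstract can be illustrated with a minimal sketch. It is not the implementation from the thesis. It assumes the Kendall distance, the alpha/n scaling of the scale parameter, particles stored as (rho, alpha) pairs with alpha > 0, and plain multinomial resampling; the move step is omitted. All function names are hypothetical.

```python
import math
import random
from itertools import combinations


def kendall_distance(r1, r2):
    """Count the item pairs that rankings r1 and r2 order differently.

    r[i] is the rank given to item i.
    """
    return sum(
        1
        for i, j in combinations(range(len(r1)), 2)
        if (r1[i] - r1[j]) * (r2[i] - r2[j]) < 0
    )


def log_partition(alpha, n):
    """Log normalising constant of the Mallows model under the Kendall
    distance, with theta = alpha / n (assumed scaling). Requires alpha > 0."""
    theta = alpha / n
    return sum(
        math.log((1 - math.exp(-j * theta)) / (1 - math.exp(-theta)))
        for j in range(1, n + 1)
    )


def log_likelihood(ranking, rho, alpha):
    """Log-probability of one full ranking given consensus rho and scale alpha."""
    n = len(rho)
    return -alpha / n * kendall_distance(ranking, rho) - log_partition(alpha, n)


def reweight_and_resample(particles, new_rankings, rng):
    """One importance-sampling and resampling update.

    particles: list of (rho, alpha) pairs approximating the current posterior.
    new_rankings: full rankings observed at this time step.
    Returns the resampled particles, the normalised weights and the
    effective sample size (ESS).
    """
    # Incremental weight: likelihood of the new rankings under each particle.
    log_w = [
        sum(log_likelihood(r, rho, alpha) for r in new_rankings)
        for rho, alpha in particles
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    w = [x / total for x in w]
    ess = 1.0 / sum(x * x for x in w)
    # Multinomial resampling (an assumption; other schemes are possible).
    resampled = rng.choices(particles, weights=w, k=len(particles))
    return resampled, w, ess
```

In a full SMC scheme, each resampled particle would then be passed through a move step, for example an MCMC kernel targeting the updated posterior, to restore diversity among the particles.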