Pinder, Thomas and Leslie, David and Nemeth, Christopher and Young, Paul (2023) Developments in Gaussian processes with applications to climate science and network problems. PhD thesis, Lancaster University.
Abstract
The ability to efficiently model complex datasets using probabilistic models is a key component of many machine learning workflows, as it offers the ability to extract accurate predictions and well-characterised uncertainty estimates. Consequently, it becomes possible to develop models that can be deployed as decision-making tools. However, evaluating such models is often computationally expensive, particularly when the assumption of independent and identically distributed data can no longer be made. This thesis explores how Gaussian process models can be used to model climate data, and how the kernel function of a Gaussian process can be adapted to operate on data observed on a network. Methodological developments are proposed to enable faster inference for Gaussian processes, to use Gaussian processes as tools for embedding hypergraphs, and to extract latent functions from vector-valued datasets. In application, this thesis explores the effect of Covid-19 on air pollution in the United Kingdom, how air pollution varies at street level, the latent structure of political networks, and what future warming can be expected on planet Earth. The question of how such models can be developed computationally is considered carefully throughout, with a specific chapter dedicated to the development of a new Gaussian process software package that allows new computational methods to be developed and tested.