Simon, Benjamen and Jewell, Christopher and Neal, Peter (2024) Big Data Epidemics. PhD thesis, Lancaster University.
Abstract
Epidemic data inference is a key tool for the control and eradication of infectious disease spread. In the modern data age, where epidemic surveillance makes data abundant, the current methods of epidemic inference are no longer sufficient. Bovine Tuberculosis is endemic in the UK and affects tens of millions of cattle each year, with data available spanning decades (APHA, 2023c). There were 21 million confirmed cases of COVID-19 in England, from a population of roughly 56 million people, over a 3-year period (UK Health Security Agency, 2023). There are also around 1 billion cases of seasonal Influenza per year worldwide, resulting in up to 650,000 deaths (World Health Organisation, 2023). The current gold-standard methods are incapable of making timely and efficient inference on big data epidemics at the individual level. In this thesis we introduce novel methodology that uses discrete-time population-aggregated approximations of epidemic data to make accurate and efficient inference for complex large-scale epidemics, whilst vastly reducing the computational burden. We apply these methods to a case study of Bovine Tuberculosis in England and Wales, including a novel method of incorporating movement data. We believe the methods developed in this thesis could form part of a multi-pronged approach for understanding and combating epidemics and pandemics of the scale we are now experiencing.