Ryan, Sean and Killick, Rebecca (2020) Detecting changepoints in multivariate data. PhD thesis, Lancaster University.
Abstract
In this thesis, we propose new methodology for detecting changepoints in multivariate data, focusing on the setting where the number of variables and the length of the data can be very large. We begin by considering the problem of detecting changepoints where only a subset of the variables is affected by the change. Previous work demonstrated that the changepoint locations and affected variables can be simultaneously estimated by solving a discrete optimisation problem. We propose two new methods, PSMOP (Pruned Subset Multivariate Optimal Partitioning) and SPOT (Subset Partitioning Optimal Time), for solving this problem. PSMOP uses novel search space reduction techniques to efficiently compute an exact solution for data of moderate size. SPOT is an approximate method that gives near-optimal solutions at a very low computational cost and can be applied to very large datasets. We use this new methodology to study changes in sales data due to the effect of promotions. We then examine the problem of detecting changes in the covariance structure of high-dimensional data. Using results from Random Matrix Theory, we introduce a novel test statistic for detecting such changes. Importantly, under the null hypothesis of no change, the distribution of this test statistic is independent of the underlying covariance matrix. We utilise this test statistic to study changes in the amount of water on the surface of a plot of soil.
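For readers unfamiliar with the kind of discrete optimisation problem the abstract refers to, the following is a minimal sketch of classical penalised-cost optimal partitioning for a change in mean in a single variable. It is not PSMOP or SPOT and does not handle subsets of affected variables; the function name `optimal_partitioning`, the squared-error segment cost, and the penalty parameter `beta` are illustrative assumptions rather than the thesis's formulation.

```python
import numpy as np


def optimal_partitioning(x, beta):
    """Illustrative sketch: exact penalised segmentation of a univariate
    series under a change-in-mean model (squared-error cost, assumed here).

    Returns the sorted changepoint locations, where a location t means
    that a new segment starts at index t.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)

    # Prefix sums allow any segment cost to be evaluated in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x ** 2)))

    def cost(a, b):
        # Residual sum of squares of x[a:b] around its own mean.
        return s2[b] - s2[a] - (s1[b] - s1[a]) ** 2 / (b - a)

    # best[t] is the minimal penalised cost of segmenting x[:t].
    # Starting at -beta means the first segment carries no net penalty.
    best = np.full(n + 1, np.inf)
    best[0] = -beta
    prev = np.zeros(n + 1, dtype=int)

    for t in range(1, n + 1):
        candidates = [best[s] + cost(s, t) + beta for s in range(t)]
        prev[t] = int(np.argmin(candidates))
        best[t] = candidates[prev[t]]

    # Backtrack through the stored segment starts to recover changepoints.
    changepoints = []
    t = n
    while t > 0:
        t = int(prev[t])
        if t > 0:
            changepoints.append(t)
    return sorted(changepoints)


if __name__ == "__main__":
    series = [0.0] * 10 + [5.0] * 10
    print(optimal_partitioning(series, beta=1.0))
```

This recursion considers every possible last changepoint for every time point, so it takes quadratic time in the length of the data. That cost is what motivates pruning and approximate search in large multivariate settings such as those the abstract describes.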