Li, Owen and Killick, Rebecca (2025) Novel methods for optimal multiple changepoint estimation. PhD thesis, Lancaster University.
2025OwenLiPhd.pdf - Published Version
Restricted to Repository staff only until 21 July 2026.
Available under License Creative Commons Attribution-NonCommercial.
Abstract
This thesis develops novel methodologies across three aspects of changepoint detection: exact search methods, uncertainty statements for changepoints, and detecting changes in the online setting. Motivated by a digital health application where the goal is to identify early signs of an individual's health decline using activity data, the first part of the thesis details a novel changepoint search method, denoted SNcirc, specifically designed to utilise the periodic nature of a data process by considering the time axis as circular rather than linear. SNcirc models periodic behaviour and has a computational cost where the total length of the data is not the dominating term, making it efficient in real-world applications. Furthermore, we extend SNcirc to detect long-term changes to periodic behaviour by combining it with the pruned exact linear time (PELT) algorithm. We demonstrate that these methods are time efficient and statistically optimal in simulation studies and in our digital health application. The second part of the thesis explores quantifying uncertainty in changepoint estimates. The changepoint detection problem seeks to identify two unknowns: the number and location of changepoints within a data set. Alongside parameter estimation, it is important to also provide a measure of uncertainty in the estimates. Uncertainty in changepoint detection is an under-studied topic. Existing methods typically condition on the changepoint selection step in order to provide post-selection confidence intervals for the estimated changepoint locations. In this thesis, we extend the intuition behind the single changepoint confidence set to multiple changes, proving its validity as a confidence set for both the number and location of changepoints. Whilst the theoretical approach is clear, its calculation in practice is non-trivial.
We build upon optimal multiple changepoint search algorithms to demonstrate how the confidence set can be calculated efficiently during the changepoint identification step. We compare our approach to existing changepoint uncertainty methods through extensive simulation studies, demonstrating that we can capture the uncertainty in both the number and the location of the estimated changepoints. This is crucial in characterising the uncertainty associated with potential missed changepoints. The third part of the thesis considers detecting multiple changepoints in the online setting, where data is not assumed to have been fully observed but is arriving constantly and changepoints need to be identified in real-time. Current methods in the online changepoint literature look for only a single changepoint; if multiple changepoints exist, the method resets in order to find the next changepoint, disregarding the past data. This degrades detection accuracy, as any errors made in the previous detections affect the estimation of future changepoints. We introduce a new online algorithm for multiple changepoints, drawing inspiration from recent efficient offline changepoint detection algorithms. This method is truly a multiple changepoint algorithm, not only flagging up new changepoints as soon as possible but also able to re-assess past changepoints to provide more accurate solutions in real-time.