André, Lídia and Wadsworth, Jennifer (2025) Modelling and inference for the body and tail regions of multivariate data. PhD thesis, Lancaster University.
Abstract
When an accurate representation of multivariate data is required across both the body (described by non-extreme observations) and the tail (defined by the extreme observations) regions, it is crucial to have a model that is able to characterise the joint behaviour across both regions. In this thesis, we focus on developing dependence models that represent the entire distribution without the need to explicitly define each region. We propose two dependence models that fit the body and tail regions. For the first model, we construct the copula from a mixture of two copulas that are defined on the whole support of the data, and blended through a dynamic increasing weighting function. In this way, we give more weight to a copula tailored to the body for lower values, and more weight to a copula tailored to the tail for larger values. This ensures that there is a smooth transition between the two regions. For the second model, we construct a copula model based on a standard mixture of Gaussian distributions. In contrast to the first model, we avoid choosing a priori which copula families to include in the model, and are only required to determine the number of mixture components in the model. Moreover, we show that this model scales relatively well to dimensions beyond the bivariate case. For both models, we derive (sub-)asymptotic dependence properties for specific model configurations, and show that they are flexible in capturing a broad range of extremal dependence structures through simulation studies. Motivated by the computational resources required to evaluate the likelihood function of the proposed models, we also explore likelihood-free approaches that use neural networks to perform inference. In particular, we assess the performance of neural Bayes estimators in estimating the model parameters, both for one of the models introduced for the joint body and tail and for further complex extremal dependence models.
We also propose using neural networks as classifiers for model selection. In this way, we provide a toolbox for simple fitting and model selection of complex extremal dependence models. Methods to estimate extremal probabilities of complex environmental phenomena are presented; these result from participation in a challenge at the 2023 EVA conference. We propose using generalised additive models as well as a conditional extremes approach to estimate such quantities.
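As an illustration of the first model described above, the following minimal sketch blends two bivariate copula densities through a weighting function that increases with the values of both variables. Everything specific here is an assumption made for illustration and is not taken from the abstract: the Gaussian copula as the "body" component, the Gumbel copula as the "tail" component, the weight (uv)^theta, and all function and parameter names are hypothetical choices. The normalising constant is approximated with a simple midpoint rule.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def gaussian_copula_density(u, v, rho):
    """Bivariate Gaussian copula density (assumed 'body' component)."""
    x = _STD_NORMAL.inv_cdf(u)
    y = _STD_NORMAL.inv_cdf(v)
    one_minus_r2 = 1.0 - rho * rho
    exponent = -(rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * one_minus_r2)
    return math.exp(exponent) / math.sqrt(one_minus_r2)


def gumbel_copula_density(u, v, alpha):
    """Bivariate Gumbel copula density, alpha >= 1 (assumed 'tail' component)."""
    x, y = -math.log(u), -math.log(v)
    s = x ** alpha + y ** alpha
    cdf = math.exp(-(s ** (1.0 / alpha)))
    return (
        (cdf / (u * v))
        * (x * y) ** (alpha - 1.0)
        * s ** (1.0 / alpha - 2.0)
        * (s ** (1.0 / alpha) + alpha - 1.0)
    )


def weight(u, v, theta):
    """Hypothetical weighting function: increasing in u and v for theta > 0."""
    return (u * v) ** theta


def unnormalised_density(u, v, rho, alpha, theta):
    """Weighted blend: tail copula dominates for large (u, v), body copula otherwise."""
    w = weight(u, v, theta)
    return w * gumbel_copula_density(u, v, alpha) + (1.0 - w) * gaussian_copula_density(u, v, rho)


def normalising_constant(rho, alpha, theta, n=200):
    """Midpoint-rule approximation of the blend's integral over the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            total += unnormalised_density(u, v, rho, alpha, theta)
    return total * h * h


if __name__ == "__main__":
    rho, alpha, theta = 0.5, 2.0, 1.0  # illustrative parameter values only
    K = normalising_constant(rho, alpha, theta)
    print("normalising constant:", K)
    print("density at (0.9, 0.9):", unnormalised_density(0.9, 0.9, rho, alpha, theta) / K)
```

Because the weight depends on (u, v), the blended function does not integrate to one on its own, which is why the sketch divides by a numerically computed constant. In general the normalised blend need not have uniform margins either, so a further marginal transformation would be needed to obtain a proper copula; the abstract does not say how this step is handled.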