Spearing, Jess and Tawn, Jonathan and Paulden, Tim and Irons, David and Bennett, Grace (2021) Ranking, and other properties, of elite swimmers using extreme value theory. Journal of the Royal Statistical Society: Series A Statistics in Society, 184 (1). pp. 368-395. ISSN 0964-1998
Spearing_Tawn_RankingEliteSwimmers_21_05_2020.pdf - Accepted Version
Available under License Creative Commons Attribution.
Abstract
The International Swimming Federation (FINA) uses a very simple points system with the aim of ranking swimmers across all swimming events. The points awarded are a function of the ratio of the recorded time to the current world record for that event. However, because some world records are considered ‘better’ than others, bias is introduced between events: points are much harder to attain in events where the world record is hard to beat. A model based on extreme value theory is introduced, in which swim times are modelled through their rate of occurrence and the distribution of the best times follows a generalised Pareto distribution. Within this framework, the strength of a particular swim is judged by its position relative to the whole distribution of swim times, rather than just the world record. The model also accounts for the date of the swim, since training methods improve over the years, as well as for changes in technology, such as full body suits. The parameters of the generalised Pareto distribution, for each of the 34 individual long course events, will be shown to vary with covariates, leading to a novel single unified description of swim quality over all events and time. This structure, which allows information to be shared across all strokes, distances, and genders, improves both predictive power and model robustness compared to equivalent independent models. A by-product of the model is that other features of interest can be estimated, such as the ultimate possible time and the distribution of new world records for any event, and that swim times can be corrected for the effect of full body suits. The methods will be illustrated using a dataset of the best 500 swim times for each event in the period 2001–2018.
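To make the two scoring ideas in the abstract concrete, here is a minimal Python sketch; it is not the authors' code. The first function assumes the widely published cubic form of the FINA points rule, P = 1000 (B / T)^3, where the base time B stands in for the world record. The cubic form and the truncation to an integer are assumptions not stated in the abstract. The second function is the standard generalised Pareto survival function, applied to a hypothetical "excess", meaning how far a swim time falls below some chosen threshold time. The function names, parameter names, and all numeric values are illustrative only.

```python
import math


def fina_points(swim_time_s: float, base_time_s: float) -> int:
    """Points under the assumed cubic rule P = 1000 * (B / T)**3.

    base_time_s plays the role of the world record for the event.
    Truncation to an integer is an assumption made for this sketch.
    """
    if swim_time_s <= 0 or base_time_s <= 0:
        raise ValueError("times must be positive")
    return int(1000 * (base_time_s / swim_time_s) ** 3)


def gpd_survival(excess: float, scale: float, shape: float) -> float:
    """P(X > excess) for a generalised Pareto distribution.

    excess: amount by which a swim beats a (hypothetical) threshold time.
    scale:  GPD scale parameter, must be positive.
    shape:  GPD shape parameter; a negative value implies a finite upper
            endpoint at scale / -shape, i.e. a limit on how fast a swim can be.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    if excess < 0:
        return 1.0  # below the threshold, outside the tail model
    if abs(shape) < 1e-12:
        return math.exp(-excess / scale)  # exponential limit as shape -> 0
    z = 1.0 + shape * excess / scale
    if z <= 0:
        return 0.0  # beyond the finite endpoint (only when shape < 0)
    return z ** (-1.0 / shape)


if __name__ == "__main__":
    # Illustrative values only, not taken from the paper or its data.
    print(fina_points(swim_time_s=48.0, base_time_s=46.9))
    print(gpd_survival(excess=0.5, scale=0.6, shape=-0.2))
```

Under the points rule, a swim is scored against a single number, the record. Under the tail model, a small survival probability marks a swim as exceptional relative to the whole fitted distribution of fast times, which is the comparison the abstract argues for.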