Page, Stephen and Grunewalder, Steffen and Pavlidis, Nicos and Leslie, David (2019) Reproducing-Kernel Hilbert space regression with notes on the Wasserstein Distance. PhD thesis, Lancaster University.
2019pagephd.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
We study kernel least-squares estimators for the regression problem subject to a norm constraint. We bound the squared L2 error of our estimators with respect to the covariate distribution. We also bound the worst-case squared L2 error of our estimators with respect to a Wasserstein ball of probability measures centred at the covariate distribution. This leads us to investigate the extreme points of Wasserstein balls.

In Chapter 3, we provide bounds on our estimators both when the regression function is unbounded and when the regression function is bounded. When the regression function is bounded, we clip the estimators so that they are closer to the regression function. In this setting, we also use training and validation to adaptively select a size for our norm constraint based on the data.

In Chapter 4, we study a different adaptive estimation procedure called the Goldenshluger-Lepski method. Unlike training and validation, this method uses all of the data to create estimators for a range of sizes of norm constraint before using pairwise comparisons to select a final estimator. We are able to adaptively select both a size for our norm constraint and a kernel.

In Chapter 5, we examine the extreme points of Wasserstein balls. We show that the only extreme points which are not on the surface of the ball are the Dirac measures. This is followed by finding conditions under which points on the surface of the ball are extreme points or not extreme points.

In Chapter 6, we provide bounds on the worst-case squared L2 error of our estimators with respect to a Wasserstein ball of probability measures centred at the covariate distribution. We prove bounds both when the regression function is unbounded and when the regression function is bounded. We also investigate the analysis and computation of alternative estimators.
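As an informal illustration of the estimators described for Chapter 3, the following Python sketch fits a kernel least-squares estimator under an RKHS norm constraint, clips its predictions, and chooses the size of the constraint by training and validation. It is not taken from the thesis. The Gaussian kernel, the bandwidth, the candidate radii, the clipping bound, the synthetic data, and every function name are assumptions made for this example. The fit relies on the standard fact that the constrained problem can be solved by a ridge-type solution whose penalty is tuned, here by bisection, until the norm constraint holds.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between rows of a (n, d) and b (m, d).
    The kernel choice is an assumption made for this sketch."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def fit_norm_constrained(x, y, radius, bandwidth=1.0, floor=1e-8):
    """Coefficients alpha of f = sum_i alpha_i k(x_i, .) that approximately
    minimise the empirical squared error subject to alpha' K alpha <= radius**2."""
    K = gaussian_kernel(x, x, bandwidth)
    s, U = np.linalg.eigh(K)       # K = U diag(s) U'
    s = np.clip(s, 0.0, None)      # remove tiny negative round-off
    z = U.T @ y

    def norm_sq(mu):
        # Squared RKHS norm of the ridge-type solution (K + mu I)^{-1} y.
        return float(np.sum(s * z ** 2 / (s + mu) ** 2))

    lo = floor
    if norm_sq(lo) <= radius ** 2:
        mu = lo                    # constraint inactive: near-interpolating fit
    else:
        hi = 1.0
        while norm_sq(hi) > radius ** 2:
            hi *= 2.0              # norm_sq decreases to 0 as mu grows
        for _ in range(200):       # bisection on the penalty mu
            mid = 0.5 * (lo + hi)
            if norm_sq(mid) > radius ** 2:
                lo = mid
            else:
                hi = mid
        mu = hi                    # feasible end of the bracket
    return U @ (z / (s + mu))


def predict(alpha, x_train, x_new, bandwidth=1.0, bound=None):
    """Evaluate the estimator; if bound is given, clip to [-bound, bound]."""
    pred = gaussian_kernel(x_new, x_train, bandwidth) @ alpha
    if bound is not None:
        pred = np.clip(pred, -bound, bound)
    return pred


def select_radius(x_tr, y_tr, x_val, y_val, radii, bound, bandwidth=1.0):
    """Return (radius, validation error, alpha) with the smallest validation
    error of the clipped estimators over the candidate radii."""
    best = None
    for r in radii:
        alpha = fit_norm_constrained(x_tr, y_tr, r, bandwidth)
        resid = predict(alpha, x_tr, x_val, bandwidth, bound) - y_val
        err = float(np.mean(resid ** 2))
        if best is None or err < best[1]:
            best = (r, err, alpha)
    return best


# Synthetic data (hypothetical): a bounded regression function plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(60, 1))
y = np.sin(3.0 * x[:, 0]) + 0.1 * rng.standard_normal(60)
radii = [0.5, 1.0, 2.0, 4.0, 8.0]
r_hat, val_err, alpha_hat = select_radius(
    x[:40], y[:40], x[40:], y[40:], radii, bound=1.0
)
```

Here the first 40 points serve as the training set and the remaining 20 as the validation set; the split sizes are arbitrary. Clipping is applied only at prediction time, so the fitted coefficients still describe an element of the norm ball.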