Managing uncertainty in machine learning techniques: An investigation of adaptive sampling strategies through land cover mappings

Phillipson, Jordan and Blair, Gordon and Henrys, Peter (2024) Managing uncertainty in machine learning techniques: An investigation of adaptive sampling strategies through land cover mappings. PhD thesis, Lancaster University.

Text (2024PhillipsonPhD)
2024Phillipson.pdf - Published Version

Abstract

In recent decades, machine learning techniques have become increasingly popular for classification problems across a wide variety of domains. For users to trust such classifiers, however, they must be able to quantify uncertainty reliably. A common way of quantifying uncertainty in classifiers is reference sampling, where a smaller set of ground truths is sampled and compared with the corresponding predictions to make statistical inferences about the precision and accuracy of the classifier. However, classification via machine learning can bring additional challenges to uncertainty quantification, as machine learning techniques are often (i) trained using data that has not been sampled with formal statistical inference in mind and (ii) black-box when compared to traditional modelling. These issues are further compounded when sampling reference data under conditions suitable for uncertainty quantification is expensive. Here, users are often forced to compromise between the degree of uncertainty and the costs of reference sampling, even when the original machine learning classifier may be performing well. In short, quantifying and reducing uncertainty is not just a matter of how well the classifier performs; one must also be able to collect enough data sampled under the right conditions. This thesis explores how users may better manage the cost-benefit trade-offs of reference sampling when quantifying and reducing uncertainty in machine learning classifiers. Specifically, it investigates how a framework for adaptively sampling reference data can be used to better manage uncertainty, using two land cover mapping case studies to evaluate the proposed framework.
With these case studies, the following problems are considered: (i) quantifying uncertainty in area estimation and mappings; (ii) proposing efficient sample designs under uncertainty; (iii) proposing sample designs when the cost of reference sampling varies across a mapped region.

Item Type:
Thesis (PhD)
ID Code:
221385
Deposited By:
Deposited On:
19 Jun 2024 12:35
Refereed?:
No
Published?:
Published
Last Modified:
14 Nov 2024 01:35