Marsden, Alan (2022) Reliability and Validity of Research with Corpora of Music. In: The Oxford Handbook of Music and Corpus Studies. Oxford University Press, Oxford, C7.P1–C7.S11. ISBN 9780190945442
Abstract
Corpus musicologists seek conclusions which are valid and applicable for demonstrably abstract reasons. This chapter examines several aspects of reliability and validity in corpus musicology: the avoidance of bias, particularly through the deliberate exclusion of expert judgment; issues of the reliability of methods of measurement used and conversion from one form of information to another; variances in interpreted data involving human judgments; errors in corpora; the notions of “random” or “representative” samples; and statistical significance in both hypothesis-testing and exploratory studies. Some research aims to extend or test theory through building models, now frequently employing machine learning. Particular care is required to separate training and test materials to avoid overfitting, i.e., building a model which works for the specific data used but not for other music. Even in the absence of machine learning, continually working with the same or a small number of corpora has similar dangers of drawing conclusions which lack wider applicability. When working with several separate corpora is not possible, techniques such as bootstrapping can be employed to estimate the reliability of conclusions. Even working with corpora which are comprehensive, such as the entire output of a composer, is not problem-free. As corpora become ever more comprehensive and corpus musicology more common, musicology is likely to have to rediscover its links with composition and its concerns with not-yet-existent music.