An investigation into the rater cognition of novice raters and the impact of cognitive attributes when assessing speaking

Eberharter, Kathrin and Harding, Luke (2021) An investigation into the rater cognition of novice raters and the impact of cognitive attributes when assessing speaking. PhD thesis, Lancaster University.

Examinations of language proficiency routinely include the assessment of speaking, which still largely necessitates the use of human raters. However, variability in rating quality is a well-established phenomenon and makes rating a fundamental validity concern (Kane, 1992; 2006). Despite increased efforts to investigate rater cognition to better understand and mitigate rater effects (Bejar, 2012), research in language testing has yet to engage fully with the field of decision research (Baker, 2012; Purpura, 2013). Findings from this literature emphasize how complex decision tasks are shaped by factors such as processing capacities, perception, deliberate and automated thinking, and metacognitive control (Newell and Bröder, 2008). The purpose of this study was to investigate how novice raters use an analytic rating scale and to explore whether decision-making style, cognitive style, working memory capacity and executive function influence rating quality and rating behaviour. Thirty-nine pre-service English teachers rated a set of speaking performances (N=30) and completed two psychological questionnaires as well as a battery of cognitive tests. Rating behaviours were captured through JavaScript embedded in the online rating form. Data analysis first established measures of rating quality and scale use through a series of Many-Facets Rasch Measurement (MFRM) analyses. Next, relationships between individual attributes and measures of rater quality and behaviour were explored in a series of correlational analyses. Finally, the handwritten notes and self-report data from four selected raters were collated and explored to further enhance understanding of the rating process. Findings showed considerable individual differences among the raters in rating quality and behaviours. Of all the variables included, decision-making style displayed the strongest associations with rating quality and behaviour, suggesting a relationship between intuitive and flexible processing and more successful rating. The four case studies highlighted a need to address cognitive load and the directing of attention in rater training for speaking assessment.

Item Type: Thesis (PhD)
ID Code:
Deposited By:
Deposited On: 22 Jul 2021 08:50
Last Modified: 16 Jul 2024 05:55