Annotator Disagreement-Based Analysis for Developing Bias Benchmark Datasets in Resource-Restricted Settings

Yogarajan, Vithya and Rayson, Paul and Dobbie, Gillian and Keesing, Aaron and Keegan, Te Taka and Benavides-Prado, Diana and Witbrock, Michael (2025) Annotator Disagreement-Based Analysis for Developing Bias Benchmark Datasets in Resource-Restricted Settings. In: International Conference on Neural Information Processing (ICONIP 2024). Lecture Notes in Computer Science. Springer, Singapore, pp. 400-415. ISBN 9789819666027

Text: disagreement_paper.pdf (Accepted Version, 20MB)
Available under License Creative Commons Attribution.

Abstract

Developing benchmark datasets to tackle the bias problem in large language models (LLMs) is difficult for mixed-ethnic, small, and/or indigenous societies with limited resources. Existing bias benchmark datasets reflect the societal makeup of resource-rich regions such as the US and Europe. A deficit of annotated datasets, a shortage of annotators, and a lack of relevant LLM-generated text limit the potential for research into debiasing techniques for resource-restricted settings. Practices such as discarding data instances with annotator disagreement, or obtaining a majority label from many annotators over multiple iterations of annotation, are not applicable in this setting because they could lead to discrimination. Rather than discarding the information in such annotations, we propose utilising annotator disagreement through a multi-annotator ensemble approach to build bias benchmark datasets. We capture annotator information by obtaining soft labels, which provide probability distributions over the hard labels that are either manually annotated or produced by pre-trained models. Firstly, we use pre-trained language models as an alternative for scenarios where manual annotation is restricted, and demonstrate that such readily accessible models yield performance similar to or better than baseline aggregated manual annotator labels. Secondly, we demonstrate that classifiers using the multi-annotator ensemble approach outperform a classification model trained on single labels.
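As a rough illustration of the soft-label idea described in the abstract (a minimal sketch, not the authors' exact pipeline), the Python example below converts hard labels from several annotators into per-instance probability distributions, and shows how a pre-trained model's predicted probabilities could be mixed in as an additional annotator. The toy data, the missing-label convention, and the simple averaging scheme are all illustrative assumptions.

import numpy as np

# Hypothetical toy data: three annotators label four text instances as
# biased (1) or not biased (0); -1 marks a missing annotation.
hard_labels = np.array([
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, -1],   # the third annotator skipped this instance
    [1, 1, 1],
])

NUM_CLASSES = 2

def soft_labels_from_annotators(labels, num_classes):
    """Turn per-annotator hard labels into one soft label per instance:
    the empirical distribution over classes, so disagreement is kept
    rather than collapsed into a majority vote or discarded."""
    soft = np.zeros((labels.shape[0], num_classes))
    for i, row in enumerate(labels):
        valid = row[row >= 0]                # drop missing annotations
        if valid.size == 0:
            soft[i] = 1.0 / num_classes     # no information: uniform
            continue
        counts = np.bincount(valid, minlength=num_classes)
        soft[i] = counts / counts.sum()
    return soft

soft = soft_labels_from_annotators(hard_labels, NUM_CLASSES)
print(soft)
# Instance 0 becomes [0.333, 0.667]: the 1-vs-2 annotator split survives
# as a probability distribution instead of reducing to the majority label 1.

# A pre-trained model acting as an extra "annotator" contributes class
# probabilities directly; plain averaging (an assumption here) mixes
# the manual and model-based sources into one soft label per instance.
model_probs = np.array([[0.2, 0.8], [0.9, 0.1], [0.4, 0.6], [0.1, 0.9]])
combined = (soft + model_probs) / 2.0

A classifier can then be trained against such distributions (for example, with a cross-entropy loss over soft targets) rather than against a single aggregated label, which is the sense in which a multi-annotator approach retains the disagreement signal.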

Item Type:
Contribution in Book/Report/Proceedings
ID Code:
236324
Deposited On:
01 Apr 2026 09:10
Refereed?:
Yes
Published?:
Published
Last Modified:
01 Apr 2026 22:05