Improving on State-of-the-Art Models for Sentiment Analysis on Saudi-English Code-Switching Text

Alghamdi, Samaher and Rayson, Paul and Alotibi, Reem (2026) Improving on State-of-the-Art Models for Sentiment Analysis on Saudi-English Code-Switching Text. In: Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script. Association for Computational Linguistics, Rabat, Morocco, pp. 218-228.

Full text not available from this repository.

Abstract

Inserting English words, phrases, or sentences while writing or speaking in the Saudi Arabic dialect has become a widespread phenomenon in Saudi society. This phenomenon is linguistically called code-switching. It remains unclear how current sentiment analysis methods perform on Saudi-English code-switching text. In this paper, we address this gap by conducting the first sentiment analysis study on Saudi-English code-switching text. We present the first Saudi-English Sentiment Analysis Code Switching Dataset (SESA-CSD) and establish baseline results on this dataset. By evaluating multiple state-of-the-art small language models, we achieve improvements of 3% to 11% over the baseline in both accuracy and macro-F1. Among all small language models, XLM-RoBERTa achieved the highest performance, with an accuracy of 95.50% and a macro-F1 of 95.53%. Our findings indicate that multilingual and Arabic small language models, such as XLM-RoBERTa, GigaBERT, and SaudiBERT, consistently outperform bilingual Arabic-English large language models, such as Fanar and ALLaM, across zero-shot and multiple few-shot settings.

Item Type:

Contribution in Book/Report/Proceedings

Departments:

Faculty of Science and Technology > School of Computing & Communications

ID Code:

236362

Deposited By:

ep_importer_pure

Deposited On:

01 Apr 2026 13:55

Refereed?:

Yes

Published?:

Published

Last Modified:

01 Apr 2026 22:05

URI:

https://eprints.lancs.ac.uk/id/eprint/236362