Transformer Models for Offensive Language Identification in Marathi

Nene, Mayuresh and North, Kai and Ranasinghe, Tharindu and Zampieri, Marcos (2021) Transformer Models for Offensive Language Identification in Marathi. In: Forum for Information Retrieval Evaluation (working notes). CEUR Workshop Proceedings, pp. 273-282.

Text (T1-28)
T1-28.pdf - Published Version

Abstract

This paper describes the WLV-RIT entry to the Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages (HASOC) shared task of 2021. The HASOC 2021 organizers provided participants with annotated datasets containing social media posts in English, Hindi, and Marathi. We participated in Marathi Subtask 1A: identifying hateful, offensive, and profane content. In our methodology, we take advantage of available data from high-resource languages by applying cross-lingual transformer-based models and transfer learning to make predictions on Marathi data. Our system achieved a macro F1 score of 0.91 on the test set and ranked 1st out of 25 systems.
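
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a macro F1 score can be computed: the F1 score is calculated separately for each class and the per-class scores are averaged with equal weight. The function name `macro_f1`, the label strings "HOF" and "NOT", and the toy predictions are assumptions chosen for illustration; they are not taken from the paper's system or data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative sketch)."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against division by zero when a class is never predicted or never present.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        per_class.append(f1)
    return sum(per_class) / len(per_class)


# Hypothetical toy labels: "HOF" = hateful/offensive/profane, "NOT" = none of these.
gold = ["HOF", "HOF", "NOT", "NOT"]
predicted = ["HOF", "NOT", "NOT", "NOT"]
score = macro_f1(gold, predicted)
print(round(score, 4))
```

Because every class contributes equally regardless of how many posts it contains, macro F1 penalizes a system that performs well only on the majority class.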

Item Type:
Contribution in Book/Report/Proceedings
ID Code:
224497
Deposited By:
Deposited On:
15 Apr 2025 13:20
Refereed?:
Yes
Published?:
Published
Last Modified:
20 Apr 2025 00:08