Transformer Models for Offensive Language Identification in Marathi

Nene, Mayuresh and North, Kai and Ranasinghe, Tharindu and Zampieri, Marcos (2021) Transformer Models for Offensive Language Identification in Marathi. In: Forum for Information Retrieval Evaluation (Working Notes). CEUR Workshop Proceedings, pp. 273-282.

Text (T1-28)
T1-28.pdf - Published Version

Abstract

This paper describes the WLV-RIT entry to the Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages (HASOC) shared task of 2021. The HASOC 2021 organizers provided participants with annotated datasets containing social media posts in English, Hindi and Marathi. We participated in Marathi Subtask 1A: identifying hateful, offensive and profane content. In our methodology, we take advantage of available data from high-resource languages by applying cross-lingual transformer-based models and transfer learning to make predictions on Marathi data. Our system achieved a macro F1 score of 0.91 on the test set and ranked 1st out of 25 systems.
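The abstract reports its result as a macro F1 score. For readers unfamiliar with the metric, the following is a minimal sketch of how macro F1 is commonly computed: F1 is calculated separately for each class and the per-class scores are averaged with equal weight. The function name `macro_f1`, the label strings `HOF` and `NOT`, and the toy data are assumptions made for illustration; they are not taken from the paper or from the shared task's evaluation script.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(gold) | set(pred))
    if not labels:
        raise ValueError("macro_f1 needs at least one label")
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels invented for illustration only (not HASOC data).
gold = ["HOF", "HOF", "NOT", "NOT", "NOT", "HOF"]
pred = ["HOF", "NOT", "NOT", "NOT", "HOF", "HOF"]
score = macro_f1(gold, pred)
print(round(score, 3))
```

Because every class contributes equally to the average, a system cannot reach a high macro F1 by favouring the majority class alone.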

Item Type: Contribution in Book/Report/Proceedings
Departments: Faculty of Science and Technology > School of Computing & Communications
ID Code: 224497
Deposited By: ep_importer_pure
Deposited On: 15 Apr 2025 13:20
Refereed?: Yes