Shardlow, Matthew and Alva-Manchego, Fernando and Batista-Navarro, Riza Theresa and Bott, Stefan and Calderon Ramirez, Saul and Cardon, Rémi and François, Thomas and Hayakawa, Akio and Horbach, Andrea and Hülsing, Anna and Imperia, Joseph Marvin and Nohej, Adam and Ide, Yusuke and North, Kai and Occhipinti, Laura and Rojas, Nelson Peréz and Raihan, Md Nishat and Ranasinghe, Tharindu and Salazar, Martin Solis and Štajner, Sanja and Zampieri, Marcos and Saggion, Horacio (2024) The BEA 2024 Shared Task on the Multilingual Lexical Simplification Pipeline. In: Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024). Association for Computational Linguistics, Kerrville, pp. 571-589. ISBN 9798891761001
2024.bea-1.51.pdf - Published Version
Available under License Creative Commons Attribution.
Abstract
We report the findings of the 2024 Multilingual Lexical Simplification Pipeline shared task. We released a new dataset comprising 5,927 instances of lexical complexity prediction and lexical simplification on common contexts across 10 languages, split into trial (300) and test (5,627). Ten teams participated across two tracks and 10 languages, with 233 runs evaluated across all systems. Five teams participated in all languages for the lexical complexity prediction task and four teams participated in all languages for the lexical simplification task. Teams employed a range of strategies, making use of open- and closed-source large language models for lexical simplification, as well as feature-based approaches for lexical complexity prediction. The highest scoring team on the combined multilingual data obtained a Pearson’s correlation of 0.6241 and an ACC@1@Top1 of 0.3772, both of which indicate that there is still room for improvement on two difficult sub-tasks of the lexical simplification pipeline.
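The abstract names two metrics, Pearson's correlation and ACC@1@Top1, without defining them. The following is a minimal sketch of how they might be computed, not the shared task's official scorer. It assumes that ACC@1@Top1 is the fraction of instances whose top-ranked system substitute matches the most frequently suggested gold substitute, and that ties among gold substitutes are all accepted. The function names `pearson_r` and `acc_at_1_top1` and the toy data are hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation between predicted and gold complexity scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def acc_at_1_top1(predictions, gold_annotations):
    """Share of instances whose first prediction is a most frequent gold substitute.

    predictions: one ranked list of candidate substitutes per instance.
    gold_annotations: one list of annotator suggestions per instance,
        with repeats when several annotators agree.
    Assumption: tied most-frequent gold substitutes all count as correct.
    """
    if len(predictions) != len(gold_annotations) or not predictions:
        raise ValueError("need equal-length, non-empty inputs")
    hits = 0
    for ranked, gold in zip(predictions, gold_annotations):
        if not ranked or not gold:
            continue  # an empty prediction or gold list counts as a miss
        counts = Counter(gold)
        best = max(counts.values())
        top_gold = {word for word, c in counts.items() if c == best}
        if ranked[0] in top_gold:
            hits += 1
    return hits / len(predictions)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    print(pearson_r([0.1, 0.4, 0.8], [0.2, 0.5, 0.7]))
    print(acc_at_1_top1(
        [["easy", "simple"], ["big"]],
        [["easy", "easy", "simple"], ["large", "large", "big"]],
    ))
```

Under these assumptions, a reported ACC@1@Top1 of 0.3772 would mean that the system's first suggestion matched the annotators' most common choice in roughly 38% of instances.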