MUSTS: MUltilingual Semantic Textual Similarity Benchmark

Ranasinghe, Tharindu and Hettiarachchi, Hansi and Orasan, Constantin and Mitkov, Ruslan (2025) MUSTS: MUltilingual Semantic Textual Similarity Benchmark. In: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, Vienna, pp. 331-353. ISBN 9798891762527

Full text not available from this repository.

Abstract

Predicting semantic textual similarity (STS) is a complex and ongoing challenge in natural language processing (NLP). Over the years, researchers have developed a variety of supervised and unsupervised approaches to calculate STS automatically, and several benchmarks containing STS datasets have been established to evaluate and compare these methods consistently. However, existing benchmarks largely focus on high-resource languages, mix in datasets annotated for relatedness rather than similarity, and include automatically translated instances; as a result, no dedicated benchmark for multilingual STS exists. To fill this gap, we introduce the Multilingual Semantic Textual Similarity Benchmark (MUSTS), which spans 13 languages, including low-resource languages. By evaluating more than 25 models on MUSTS, we establish the most comprehensive benchmark of multilingual STS methods to date. Our findings confirm that STS remains a challenging task, particularly for low-resource languages.
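The abstract mentions supervised and unsupervised approaches to computing STS automatically. A common unsupervised baseline (illustrative here, not necessarily one of the methods evaluated in the paper) scores a sentence pair by the cosine similarity of their sentence embeddings, rescaled to the 0-5 range typically used for STS annotation. The sketch below uses hand-made toy vectors in place of real embeddings; the function names and the linear rescaling are assumptions for illustration only.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def sts_score(emb1, emb2, low=0.0, high=5.0):
    """Map cosine similarity in [-1, 1] linearly onto an STS-style [low, high] scale."""
    cos = cosine_similarity(emb1, emb2)
    return low + (high - low) * (cos + 1.0) / 2.0

# Toy 3-dimensional "sentence embeddings" (hypothetical values).
e1 = [0.2, 0.8, 0.1]
e2 = [0.25, 0.75, 0.05]
print(round(sts_score(e1, e2), 2))
```

In practice the embeddings would come from a multilingual sentence encoder; the point of the sketch is only the scoring step, which is the same regardless of how the vectors are produced.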

Item Type: Contribution in Book/Report/Proceedings
ID Code: 231818
Deposited By:
Deposited On: 18 Nov 2025 10:20
Refereed?: Yes
Published?: Published
Last Modified: 18 Nov 2025 23:15