MentalHelp: A Multi-Task Dataset for Mental Health in Social Media

Raihan, Md Nishat and Puspo, Sadiya Sayara Chowdhury and Farabi, Shafkat and Bucur, Ana-Maria and Ranasinghe, Tharindu and Zampieri, Marcos (2024) MentalHelp: A Multi-Task Dataset for Mental Health in Social Media. In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). ELRA and ICCL, ITA, pp. 11196-11203. ISBN 9782493814104

Text (2024.lrec-main.977)
2024.lrec-main.977.pdf - Published Version
Available under License Creative Commons Attribution-NonCommercial.

Abstract

Early detection of mental health disorders is an essential step in treating and preventing mental health conditions. Computational approaches have been applied to users’ social media profiles in an attempt to identify various mental health conditions such as depression, PTSD, schizophrenia, and eating disorders. The interest in this topic has motivated the creation of various depression detection datasets. However, annotating such datasets is expensive and time-consuming, limiting their size and scope. To overcome this limitation, we present MentalHelp, a large-scale semi-supervised mental disorder detection dataset containing 14 million instances. The corpus was collected from Reddit and labeled in a semi-supervised way using an ensemble of three separate models: flan-T5, Disor-BERT, and Mental-BERT.

Item Type:
Contribution in Book/Report/Proceedings
ID Code:
221621
Deposited By:
Deposited On:
20 Nov 2024 14:50
Refereed?:
Yes
Published?:
Published
Last Modified:
20 Nov 2024 14:50