Arreerard, Ratchakrit and Senivongse, Twittie (2018) Thai Defamatory Text Classification on Social Media. In: Proceedings of the 2018 IEEE International Conference on Big Data, Cloud Computing, Data Science & Engineering (BCD 2018). Yonago, Japan, pp. 73-78. ISBN 978-1-5386-5606-8
Full text not available from this repository.

Abstract
The development of social media has brought great change to social communities in several respects. Social media platforms offer a place where users can post information, express opinions, and share interests. However, some information and opinions may have a negative impact on the person mentioned in a post, and that person can become a target of defamation. In Thailand, although defaming someone on social media is illegal, most social media users are not aware of it. To raise awareness of this issue, this paper proposes the classification of defamatory text in the Thai language. Several approaches to text classification are used to analyze textual comments on political news and articles on Facebook, including word n-grams, character n-grams, specific terms, grammatical dependency structure, and sentiment polarity. The experiment is conducted using two machine learning methods, SVM and Naïve Bayes, with several combinations of these approaches. The results show that SVM performs better than Naïve Bayes, and that word n-grams and character n-grams are more effective than the other approaches, with an F-score of 0.64 and an accuracy of 0.74. In addition, dependency structure, specific terms, and sentiment polarity perform quite well, with a precision of 0.65 and an accuracy of 0.66, but with a lower recall of 0.35. We discuss linguistic variations in the Thai language that affect the performance of the methods.
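As a rough illustration of two ideas in the abstract, the sketch below shows how word and character n-grams can be counted and how precision and recall combine into an F-score. It is not the authors' implementation. The function names and example tokens are hypothetical, the tokens are assumed to be already segmented (Thai is written without spaces between words, so a real pipeline would need a word segmenter first), and the F-score is assumed to be the balanced F1, which the abstract does not specify.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count overlapping character n-grams in a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(tokens, n):
    """Count overlapping word n-grams over an already segmented token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f_score(precision, recall):
    """Balanced F1: the harmonic mean of precision and recall (an assumption)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical, neutral example tokens ("news", "politics", "today").
tokens = ["ข่าว", "การเมือง", "วันนี้"]
bigrams = word_ngrams(tokens, 2)
trigrams = char_ngrams("".join(tokens), 3)

# Precision 0.65 and recall 0.35 are the figures quoted in the abstract.
# The F1 computed from them is derived here, not reported by the authors.
derived_f1 = f_score(0.65, 0.35)
print(bigrams)
print(trigrams.most_common(3))
print(derived_f1)
```

Character n-grams need no word segmentation, which may be one reason they suit Thai text. The counts produced this way would typically be turned into feature vectors and passed to a classifier such as an SVM or Naïve Bayes.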