Age Identification of Twitter Users: Classification Methods and Sociolinguistic Analysis

Simaki, Vasiliki and Mporas, Iosif and Megalooikonomou, Vasileios (2016) Age Identification of Twitter Users: Classification Methods and Sociolinguistic Analysis. In: Computational Linguistics and Intelligent Text Processing: 17th International Conference, CICLing 2016, Konya, Turkey, April 3–9, 2016, Revised Selected Papers, Part II. Lecture Notes in Computer Science. Springer, Cham, pp. 385-395. ISBN 9783319754864

Full text not available from this repository.

Abstract

In this article, we address the problem of identifying the age of Twitter users from their online text. We used a set of text mining, sociolinguistic and content-related text features, and we evaluated a number of well-known, widely used machine learning classification algorithms to examine their suitability for this task. In our experiments, the Random Forest algorithm performed best, achieving an accuracy of 61%. We ranked the classification features by informativeness using the ReliefF algorithm, and we analyzed the results in terms of the sociolinguistic principles of age-related linguistic variation.
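
The feature-ranking step mentioned in the abstract can be illustrated with a minimal sketch. It implements basic Relief (one nearest hit and one nearest miss per instance, two classes), which is a simplification of ReliefF and not the authors' implementation. The feature names, the toy values and the two age classes are hypothetical and chosen only for illustration; they are not data from the paper.

```python
from typing import List, Sequence, Tuple


def relief_scores(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Basic Relief weights (a simplification of ReliefF).

    A feature gains weight when it differs between an instance and its
    nearest neighbour of another class (miss), and loses weight when it
    differs from the nearest neighbour of the same class (hit).
    """
    n, d = len(X), len(X[0])

    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a: Sequence[float], b: Sequence[float], j: int) -> float:
        return abs(a[j] - b[j]) / spans[j]

    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no neighbour of one kind, so nothing to compare
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return weights


def rank_features(
    names: Sequence[str], X: Sequence[Sequence[float]], y: Sequence[int]
) -> List[Tuple[str, float]]:
    """Return (feature name, weight) pairs, most informative first."""
    return sorted(zip(names, relief_scores(X, y)), key=lambda p: p[1], reverse=True)


# Hypothetical toy data: two made-up features for six users in two age classes.
names = ["emoticon_rate", "noise"]
X = [[0.9, 0.1], [0.8, 0.9], [0.85, 0.5], [0.1, 0.2], [0.2, 0.8], [0.15, 0.5]]
y = [0, 0, 0, 1, 1, 1]

scores = relief_scores(X, y)
ranking = rank_features(names, X, y)
print(ranking)
```

In this toy set the first feature separates the two classes and the second does not, so the first receives the higher weight. Full ReliefF additionally averages over several nearest neighbours and handles more than two classes.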

Item Type:
Contribution in Book/Report/Proceedings
ID Code:
124808
Deposited On:
23 Apr 2018 12:58
Refereed?:
Yes
Published?:
Published
Last Modified:
16 Jul 2024 04:17