Collective self-understanding : A linguistic style analysis of naturally occurring text data

Cork, A. and Everson, R. and Naserian, E. and Levine, M. and Koschate-Reis, M. (2023) Collective self-understanding : A linguistic style analysis of naturally occurring text data. Behavior Research Methods, 55 (8). pp. 4455-4477. ISSN 1554-351X

Full text not available from this repository.


Understanding what groups stand for is integral to a diverse array of social processes, ranging from understanding political conflicts to organisational behaviour to promoting public health behaviours. Traditionally, researchers rely on self-report methods such as interviews and surveys to assess groups' collective self-understandings. Here, we demonstrate the value of using naturally occurring online textual data to map the similarities and differences between real-world groups' collective self-understandings. We use machine learning algorithms to assess similarities between 15 diverse online groups' linguistic style, and then use multidimensional scaling to map the groups in two-dimensional space (N = 1,779,098 Reddit comments). We then use agglomerative and k-means clustering techniques to assess how the 15 groups cluster, finding there are four behaviourally distinct group types: vocational, collective action (comprising political and ethnic/religious identities), relational and stigmatised groups, with stigmatised groups having a less distinctive behavioural profile than the other group types. Study 2 is a secondary data analysis where we find strong relationships between the coordinates of each group in multidimensional space and the groups' values. In Study 3, we demonstrate how this approach can be used to track the development of groups' collective self-understandings over time. Using transgender Reddit data (N = 1,095,620 comments) as a proof-of-concept, we track the gradual politicisation of the transgender group over the past decade. The automaticity of this methodology renders it advantageous for monitoring multiple online groups simultaneously. This approach has implications for both governmental agencies and social researchers more generally. Future research avenues and applications are discussed.
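The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the group labels, the dissimilarity values, the average-linkage rule and the function name `average_linkage` are all assumptions chosen for illustration, and the similarity-estimation and multidimensional scaling steps are omitted. It shows only how agglomerative clustering merges the closest groups, given a matrix of pairwise dissimilarities in linguistic style.

```python
# Sketch only: labels and dissimilarity values are invented for illustration
# and are not data from the study. Average linkage is an assumed choice.
from itertools import combinations


def average_linkage(dissim, labels, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(labels))]

    def cluster_distance(a, b):
        # Mean pairwise dissimilarity between the members of a and b.
        return sum(dissim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)

    # Return label names in a stable, sorted order.
    return sorted(sorted(labels[i] for i in c) for c in clusters)


# Hypothetical symmetric dissimilarity matrix for four placeholder groups.
labels = ["group_a", "group_b", "group_c", "group_d"]
dissim = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]

result = average_linkage(dissim, labels, n_clusters=2)
print(result)
```

In this toy input, the two pairs with the smallest dissimilarities are merged first, so the four placeholder groups end up in two clusters. A real analysis would more likely rely on an established library for both the scaling and the clustering steps.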

Item Type:
Journal Article
Journal or Publication Title:
Behavior Research Methods
Deposited On:
14 Dec 2022 12:10
Last Modified:
18 Dec 2023 09:40