An Army of Me: Sockpuppets in Online Discussion Communities (1703.07355v1)

Published 21 Mar 2017 in cs.SI, cs.CY, physics.soc-ph, stat.AP, and stat.ML

Abstract: In online discussion communities, users can interact and share information and opinions on a wide variety of topics. However, some users may create multiple identities, or sockpuppets, and engage in undesired behavior by deceiving others or manipulating discussions. In this work, we study sockpuppetry across nine discussion communities, and show that sockpuppets differ from ordinary users in terms of their posting behavior, linguistic traits, as well as social network structure. Sockpuppets tend to start fewer discussions, write shorter posts, use more personal pronouns such as "I", and have more clustered ego-networks. Further, pairs of sockpuppets controlled by the same individual are more likely to interact on the same discussion at the same time than pairs of ordinary users. Our analysis suggests a taxonomy of deceptive behavior in discussion communities. Pairs of sockpuppets can vary in their deceptiveness, i.e., whether they pretend to be different users, or their supportiveness, i.e., if they support arguments of other sockpuppets controlled by the same user. We apply these findings to a series of prediction tasks, notably, to identify whether a pair of accounts belongs to the same underlying user or not. Altogether, this work presents a data-driven view of deception in online discussion communities and paves the way towards the automatic detection of sockpuppets.

Summary

Understanding Sockpuppetry in Online Discussion Communities

The paper "An Army of Me: Sockpuppets in Online Discussion Communities" analyzes sockpuppet usage across nine online discussion communities. Sockpuppets, defined as multiple user accounts controlled by a single individual (the puppetmaster), raise concerns because they can be used to deceive other users and manipulate discussions. The authors use empirical data to characterize how sockpuppets behave, how prevalent they are, and how they operate within these communities.

Key Findings and Methodology

The paper begins by establishing a method for identifying sockpuppets: it flags accounts that post from the same IP address in the same discussion threads within a short timeframe on at least three different occasions. The authors note the limits of existing approaches, which rely heavily on assumptions such as similar usernames or consistently expressed opinions. The paper instead labels sockpuppets from behavioral traces such as IP addresses and session data.
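The identification rule described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `candidate_sockpuppet_pairs`, the tuple layout of a post, the 15-minute window, and the reading of "three occasions" as three distinct discussions are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical thresholds: the summary only says "short timeframe" and
# "at least three different occasions".
WINDOW_SECONDS = 15 * 60
MIN_DISCUSSIONS = 3


def candidate_sockpuppet_pairs(posts, window=WINDOW_SECONDS,
                               min_discussions=MIN_DISCUSSIONS):
    """Return account pairs that repeatedly co-post from one IP address.

    posts: iterable of (user, ip, discussion_id, unix_timestamp) tuples.
    """
    # Group posts that share both an IP address and a discussion.
    by_ip_and_discussion = defaultdict(list)
    for user, ip, discussion, timestamp in posts:
        by_ip_and_discussion[(ip, discussion)].append((timestamp, user))

    # For each pair of distinct accounts, collect the discussions in which
    # they posted from the same IP within the time window.
    shared = defaultdict(set)
    for (_ip, discussion), entries in by_ip_and_discussion.items():
        for (t1, u1), (t2, u2) in combinations(entries, 2):
            if u1 != u2 and abs(t1 - t2) <= window:
                shared[tuple(sorted((u1, u2)))].add(discussion)

    # Keep only pairs that co-posted in enough distinct discussions.
    return {pair for pair, discussions in shared.items()
            if len(discussions) >= min_discussions}
```

Requiring several distinct discussions rather than a single co-occurrence is what makes a rule like this conservative: two unrelated people behind a shared IP address (for example, on a campus network) are unlikely to post minutes apart in the same thread again and again.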

Using this procedure, the authors identify 3,656 sockpuppets, organized into 1,623 sockpuppet groups, among a total of 2.9 million users. Their analysis shows that sockpuppets behave differently from ordinary users. They tend to generate more posts, but they start fewer discussions and write shorter posts with reduced linguistic complexity, often using more first-person singular pronouns such as "I" and fewer negations. Sockpuppet activity is concentrated in controversial topic areas, which is consistent with previous research that associates sockpuppetry with influence manipulation.

Implications of Sockpuppet Interactions and Behavior

The paper also examines how sockpuppets interact with each other. Pairs of sockpuppets controlled by the same individual are more likely than pairs of ordinary users to post in the same threads at the same time. Sockpuppets also have more clustered ego-networks and exhibit higher network centrality in the discussion networks they take part in. Notably, sockpuppet pairs tend to support and affirm each other's contributions, which can inflate the perceived consensus around, or popularity of, a position.

Beyond these general characteristics, the authors propose a taxonomy of sockpuppetry along two dimensions: deceptiveness and supportiveness. On the first dimension, "pretenders," who present themselves as distinct individuals, contrast with "non-pretenders," who may not disguise their shared control. On the second, sockpuppet pairs differ in whether they support the arguments of other sockpuppets controlled by the same user. These differences extend to the emotional and linguistic content of their interactions, suggesting varied strategic uses of sockpuppetry.

Detection and Predictive Modeling of Sockpuppets

A significant contribution of this research is the development of predictive models for sockpuppet detection. The models draw on linguistic, activity-based, and community-derived features. Their ROC AUC scores indicate high efficacy in determining whether a pair of accounts belongs to the same underlying user, and moderate success in distinguishing individual sockpuppets from ordinary users. These capabilities matter for building moderation tools that protect authenticity in online discourse.
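For readers unfamiliar with the metric, ROC AUC can be read as the probability that a classifier scores a randomly chosen positive example (here, a sockpuppet or a sockpuppet pair) above a randomly chosen negative one. The sketch below computes it directly from that definition. The helper name `roc_auc` and the example scores are illustrative assumptions, not the paper's models or results.

```python
def roc_auc(labels, scores):
    """ROC AUC as the chance that a positive outranks a negative.

    labels: iterable of 0/1 class labels (1 = sockpuppet, by assumption).
    scores: classifier scores, higher meaning "more likely positive".
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up scores for four accounts: two sockpuppets, two ordinary users.
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

A value of 0.5 corresponds to random guessing and 1.0 to perfect separation, which is the scale on which "high" and "moderate" performance should be read. This pairwise loop is quadratic in the number of examples; it is meant to show the definition, not to be used on large datasets.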

Conclusion and Future Directions

In summary, the paper marks a substantial step toward understanding and detecting online sockpuppetry. Communities can use these insights to foster healthier interactions, but the research also cautions against treating all sockpuppetry as uniformly harmful. The nuances in sockpuppet behavior, including non-deceptive uses, highlight the difficulty of moderating online forums and the need to balance regulation with expression.

Future research could use multi-platform datasets to study cross-platform sockpuppetry and could examine how identity fluidity affects sockpuppet engagement. There is also potential for integrating psychological and sociological frameworks to explain what motivates individuals to engage in sockpuppetry. These directions would support a more complete understanding of identity formation and manipulation in digital environments.