Socially-Informed Content Analysis of Online Human Behavior (2509.10807v1)

Published 13 Sep 2025 in cs.SI

Abstract: The explosive growth of social media has not only revolutionized communication but also brought challenges such as political polarization, misinformation, hate speech, and echo chambers. This dissertation employs computational social science techniques to investigate these issues, understand the social dynamics driving negative online behaviors, and propose data-driven solutions for healthier digital interactions. I begin by introducing a scalable social network representation learning method that integrates user-generated content with social connections to create unified user embeddings, enabling accurate prediction and visualization of user attributes, communities, and behavioral propensities. Using this tool, I explore three interrelated problems: 1) COVID-19 discourse on Twitter, revealing polarization and asymmetric political echo chambers; 2) online hate speech, suggesting the pursuit of social approval motivates toxic behavior; and 3) moral underpinnings of COVID-19 discussions, uncovering patterns of moral homophily and echo chambers, while also indicating moral diversity and plurality can improve message reach and acceptance across ideological divides. These findings contribute to the advancement of computational social science and provide a foundation for understanding human behavior through the lens of social interactions and network homophily.

Summary

  • The paper introduces Social-LLM, a scalable method integrating user content and social connections to generate unified user embeddings.
  • It demonstrates improved prediction of user attributes and community behaviors across diverse case studies such as COVID-19 polarization and online hate speech.
  • The analysis suggests that social approval from a user's network reinforces toxic behavior, and that moral diversity can improve message reach and acceptance across ideological divides.

Overview

The paper "Socially-Informed Content Analysis of Online Human Behavior" presents an analysis of social dynamics on online platforms, notably Twitter, and introduces methodologies to understand and mitigate negative online behavior. This work emphasizes the role of large-scale social network representation learning in predicting user behavior and attributes by integrating user-generated content with social connections.

Social Network Representation Learning

The cornerstone of this research is a scalable social network representation learning method named Social-LLM. The method combines user-generated content with social connections to create unified user embeddings, which are then used to predict and visualize user attributes, community memberships, and behavioral tendencies. Social-LLM is evaluated on several datasets and is reported to perform well and adapt across them.

Figure 1: Overview of the Social-LLM method.
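To make the idea of a unified user embedding concrete, here is a minimal sketch of one way to mix a user's content vector with those of their neighbors. It is not the Social-LLM training procedure, which this summary does not describe; neighbor averaging is a simple stand-in, and the function names, the `alpha` mixing weight, and the toy vectors are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def blend_with_neighbors(content, edges, alpha=0.5):
    """Mix each user's content vector with the mean of their neighbors' vectors.

    content: dict mapping user id -> list of floats (hypothetical content embedding)
    edges:   iterable of (u, v) pairs, treated as undirected social connections
    alpha:   weight on the user's own content (hypothetical hyperparameter)
    """
    neighbors = {u: set() for u in content}
    for u, v in edges:
        if u in content and v in content and u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)

    unit = {u: l2_normalize(vec) for u, vec in content.items()}
    blended = {}
    for u, vec in unit.items():
        nbrs = neighbors[u]
        if nbrs:
            mean_nbr = [sum(unit[n][i] for n in nbrs) / len(nbrs)
                        for i in range(len(vec))]
            mixed = [alpha * own + (1 - alpha) * nb
                     for own, nb in zip(vec, mean_nbr)]
        else:
            mixed = vec  # isolated users keep their content-only vector
        blended[u] = l2_normalize(mixed)
    return blended


def cosine(a, b):
    """Cosine similarity for vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


# Toy data (synthetic): users "a" and "b" are connected, "c" is isolated.
toy_content = {"a": [1.0, 0.0], "b": [0.8, 0.6], "c": [0.0, 1.0]}
toy_embeddings = blend_with_neighbors(toy_content, edges=[("a", "b")])
print(cosine(toy_embeddings["a"], toy_embeddings["b"]))
```

In this sketch, connected users are pulled toward each other in embedding space while isolated users keep their content-only representation, which is the intuition behind letting network homophily inform content-based features.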

Application in COVID-19 Discourse

The paper explores three focused case studies: COVID-19 polarization, online hate speech, and the moral dimensions of COVID-19 discussions. The first is covered here; the other two follow in their own sections.

  1. COVID-19 Polarization:
    • It examines the role of political partisanship in the Twitter discourse about COVID-19 and reveals asymmetric ideological echo chambers. Right-leaning users were found to form a more cohesive and isolated group, whereas left-leaning and neutral users displayed more diverse interactions.
    • User embeddings generated by the model provided insights into the structural basis of these echo chambers and the spread of misinformation among partisan groups.

      Figure 2: COVID-19 dataset statistics of left-leaning (bottom 20%), neutral (middle 20%), and right-leaning (top 20%) users partitioned by their verification status.
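The quantile split in the Figure 2 caption, and the notion of a cohesive, isolated group, can be illustrated with a short sketch. It assumes each user already has a scalar political-leaning score where lower means more left-leaning; the scoring method, the function names, and the retweet list are hypothetical, and the within-group retweet share is only one simple proxy for an echo chamber, not necessarily the measure used in the paper.

```python
def partition_by_leaning(scores, tail=0.2):
    """Label the bottom, middle, and top `tail` fraction of users by leaning score.

    scores: dict mapping user id -> float (assumed: lower = more left-leaning)
    Users outside the three bands are left unlabeled.
    """
    ranked = sorted(scores, key=scores.get)
    n = len(ranked)
    k = int(n * tail)
    mid_start = (n - k) // 2
    groups = {}
    for user in ranked[:k]:
        groups[user] = "left"
    for user in ranked[mid_start:mid_start + k]:
        groups[user] = "neutral"
    for user in ranked[n - k:]:
        groups[user] = "right"
    return groups


def within_group_share(retweets, groups):
    """Fraction of each group's retweets that target a user in the same group.

    retweets: iterable of (source_user, retweeted_user) pairs
    """
    totals, same = {}, {}
    for src, dst in retweets:
        if src in groups and dst in groups:
            g = groups[src]
            totals[g] = totals.get(g, 0) + 1
            same[g] = same.get(g, 0) + (1 if groups[dst] == g else 0)
    return {g: same[g] / totals[g] for g in totals}


# Synthetic example: ten users with evenly spaced leaning scores.
toy_scores = {f"u{i}": i / 9 for i in range(10)}
toy_groups = partition_by_leaning(toy_scores)
toy_retweets = [("u8", "u9"), ("u9", "u8"), ("u0", "u1"),
                ("u0", "u4"), ("u4", "u8"), ("u5", "u0")]
print(within_group_share(toy_retweets, toy_groups))
```

A share close to 1 for one group and lower shares for the others would correspond to the asymmetric pattern described above, where one side interacts mostly with itself.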

Online Hate Speech

The research also explores the propagation of online toxicity and suggests that the pursuit of social approval motivates users to post hateful messages. Users tend to become more toxic after receiving positive reinforcement from their social networks.

  • The findings indicate that online behavior is shaped by patterns of reinforcement and by a user's position in the network, underscoring the interplay between individual actions and broader social norms.

    Figure 3: When an anchor tweet in the hate speech dataset receives a substantially lower (red) or higher (blue) number of retweets than expected, the difference in maximum toxicity (k=50) is statistically significant.
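A comparison like the one in Figure 3 can be sketched as a two-sample test on toxicity changes. The summary does not state which statistical test the paper uses, so the permutation test below is a swapped-in choice; the function name, the toxicity-change values, and the two groups are synthetic and purely illustrative.

```python
import random
from statistics import mean


def permutation_pvalue(group_a, group_b, n_resamples=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns the share of random relabelings whose absolute mean difference
    is at least as large as the observed one (with add-one smoothing).
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)


# Synthetic toxicity changes after an anchor tweet (hypothetical values):
# users whose anchor got fewer retweets than expected vs. more than expected.
fewer_than_expected = [-0.06, -0.05, -0.04, -0.05, -0.07, -0.03]
more_than_expected = [0.05, 0.04, 0.06, 0.03, 0.07, 0.05]

p_value = permutation_pvalue(fewer_than_expected, more_than_expected)
print(f"p-value: {p_value:.4f}")
```

A permutation test makes no distributional assumption about toxicity scores, which is why it is a reasonable default here; a parametric or rank-based test could be substituted if its assumptions hold.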

Morality and User Behavior

A significant contribution of this paper is its investigation into the moral dimensions of online communication patterns, particularly concerning COVID-19 discussions.

  • The research finds patterns of moral homophily: users with similar moral values, as characterized by Moral Foundations Theory, tend to interact with one another more frequently.
  • Notably, the findings indicate that moral diversity and plurality can improve message reach and acceptance across ideological divides, suggesting strategies for cross-cutting communication.

    Figure 4: The average moral z-scores of each foundation for the four user groups in the COVID-19 dataset.
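The group-level moral z-scores in Figure 4 can be illustrated with a small sketch. It assumes each user already has one score per moral foundation (how those scores are obtained is not covered in this summary), standardizes each foundation across all users, and averages within each group. The function name, group labels, and numbers are hypothetical.

```python
from statistics import mean, pstdev

# The five foundations of Moral Foundations Theory.
FOUNDATIONS = ("care", "fairness", "loyalty", "authority", "purity")


def group_mean_zscores(user_scores, user_group, foundations=FOUNDATIONS):
    """Average per-foundation z-scores within each user group.

    user_scores: dict user id -> dict foundation -> float (hypothetical scores)
    user_group:  dict user id -> group label
    """
    collected = {}
    users = list(user_scores)
    for f in foundations:
        values = [user_scores[u][f] for u in users]
        mu, sigma = mean(values), pstdev(values)
        for u in users:
            # A foundation with no variance carries no signal, so z is set to 0.
            z = (user_scores[u][f] - mu) / sigma if sigma > 0 else 0.0
            collected.setdefault(user_group[u], {}).setdefault(f, []).append(z)
    return {g: {f: mean(zs) for f, zs in per_f.items()}
            for g, per_f in collected.items()}


# Synthetic example with two hypothetical groups of two users each.
toy_scores = {
    "u1": {"care": 0.8, "fairness": 0.7, "loyalty": 0.2, "authority": 0.1, "purity": 0.3},
    "u2": {"care": 0.6, "fairness": 0.9, "loyalty": 0.3, "authority": 0.2, "purity": 0.3},
    "u3": {"care": 0.2, "fairness": 0.3, "loyalty": 0.8, "authority": 0.7, "purity": 0.3},
    "u4": {"care": 0.4, "fairness": 0.1, "loyalty": 0.9, "authority": 0.6, "purity": 0.3},
}
toy_group = {"u1": "group_a", "u2": "group_a", "u3": "group_b", "u4": "group_b"}

for group, zs in group_mean_zscores(toy_scores, toy_group).items():
    print(group, {f: round(z, 2) for f, z in zs.items()})
```

Standardizing per foundation puts the foundations on a common scale, so a positive group mean reads as "this group emphasizes the foundation more than the average user in the dataset".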

Conclusion

This study advances our understanding of online social dynamics through the development of scalable methodologies tailored for large-scale data. By examining how content and network structure jointly shape behavior on social platforms, it provides insights for mitigating negative behavior and fostering healthier interactions. Future work could refine these techniques and explore their applications in broader contexts beyond social media.

Implications

Social-LLM’s capacity to generalize across tasks and datasets exemplifies the promise of integrating content and structural data in computational social science. Platforms can tailor this knowledge to improve content moderation and facilitate open, respectful discourse. Policymakers may leverage these insights to develop informed regulations that balance user freedom and harm mitigation.

In summary, this paper lays groundwork for future research into how digital interactions shape human behavior and how online engagement can become more empathetic and better informed.

Authors (1)
