Social-LLM: Modeling User Behavior at Scale using Language Models and Social Network Data (2401.00893v1)

Published 31 Dec 2023 in cs.SI and cs.AI

Abstract: The proliferation of social network data has unlocked unprecedented opportunities for extensive, data-driven exploration of human behavior. The structural intricacies of social networks offer insights into various computational social science issues, particularly concerning social influence and information diffusion. However, modeling large-scale social network data comes with computational challenges. Though LLMs make it easier than ever to model textual content, many advanced network representation methods struggle with scalability and efficient deployment to out-of-sample users. In response, we introduce a novel approach tailored for modeling social network data in user detection tasks. This innovative method integrates localized social network interactions with the capabilities of LLMs. Operating under the premise of social network homophily, which posits that socially connected users share similarities, our approach is designed to address these challenges. We conduct a thorough evaluation of our method across seven real-world social network datasets, spanning a diverse range of topics and detection tasks, showcasing its applicability to advance research in computational social science.

Authors (2)
  1. Julie Jiang (17 papers)
  2. Emilio Ferrara (197 papers)