Social-LLM: Modeling User Behavior at Scale using Language Models and Social Network Data (2401.00893v1)
Abstract: The proliferation of social network data has unlocked unprecedented opportunities for extensive, data-driven exploration of human behavior. The structural intricacies of social networks offer insights into various computational social science issues, particularly concerning social influence and information diffusion. However, modeling large-scale social network data comes with computational challenges. Though LLMs make it easier than ever to model textual content, many advanced network representation methods struggle with scalability and efficient deployment to out-of-sample users. In response, we introduce a novel approach tailored for modeling social network data in user detection tasks. This method integrates localized social network interactions with the capabilities of LLMs. Operating under the premise of social network homophily, which posits that socially connected users share similarities, our approach is designed to address these scalability and deployment challenges. We conduct a thorough evaluation of our method across seven real-world social network datasets, spanning a diverse range of topics and detection tasks, showcasing its applicability to advance research in computational social science.
Authors: Julie Jiang, Emilio Ferrara