Uncovering Latent Arguments in Social Media Messaging by Employing LLMs-in-the-Loop Strategy (2404.10259v3)
Abstract: The widespread use of social media has led to a surge in popularity for automated methods of analyzing public opinion. Supervised methods are adept at text categorization, yet the dynamic nature of social media discussions poses a continual challenge for these techniques because the focus of discussion shifts constantly. Traditional unsupervised methods for extracting themes from public discourse, such as topic modeling, often reveal overarching patterns but may miss specific nuances. Consequently, a significant portion of research into social media discourse still depends on labor-intensive manual coding and human-in-the-loop approaches, which are both time-consuming and costly. In this work, we study the problem of discovering arguments associated with a specific theme. We propose a generic LLMs-in-the-Loop strategy that leverages the advanced capabilities of LLMs to extract latent arguments from social media messaging. To demonstrate our approach, we apply our framework to contentious topics using two publicly available datasets: (1) the climate campaigns dataset of 14k Facebook ads with 25 themes and (2) the COVID-19 vaccine campaigns dataset of 9k Facebook ads with 14 themes. Additionally, we design a downstream stance prediction task that leverages talking points in climate debates. Furthermore, we analyze demographic targeting and the adaptation of messaging based on real-world events.
Authors: Tunazzina Islam, Dan Goldwasser