Vax-Culture: A Dataset for Studying Vaccine Discourse on Twitter (2304.06858v3)

Published 13 Apr 2023 in cs.SI, cs.CL, and cs.LG

Abstract: Vaccine hesitancy continues to be a major challenge for public health officials during the COVID-19 pandemic. Because this hesitancy undermines vaccine campaigns, many researchers have sought to identify its root causes, finding that the increasing volume of anti-vaccine misinformation on social media platforms is a key element of the problem. We explored Twitter as a source of misleading content with the goal of extracting the overlapping cultural and political beliefs that motivate the spread of vaccine misinformation. To do this, we collected a dataset of vaccine-related tweets and annotated them with the help of a team of annotators with a background in communications and journalism. Ultimately, we hope this can lead to effective and targeted public health communication strategies for reaching individuals with anti-vaccine beliefs. This information also helps in developing machine learning models that automatically detect vaccine misinformation posts and combat their negative impacts. In this paper, we present Vax-Culture, a novel Twitter COVID-19 dataset consisting of 6373 vaccine-related tweets accompanied by an extensive set of human-provided annotations, including vaccine-hesitancy stance, indication of any misinformation in tweets, the entities criticized and supported in each tweet, and the communicated message of each tweet. Moreover, we define five baseline tasks, comprising four classification tasks and one sequence generation task, and report the results of a set of recent transformer-based models on them. The dataset and code are publicly available at https://github.com/mrzarei5/Vax-Culture.

Authors
  1. Mohammad Reza Zarei
  2. Michael Christensen
  3. Sarah Everts
  4. Majid Komeili