
Hierarchical Multi-Label Classification of Online Vaccine Concerns (2402.01783v1)

Published 1 Feb 2024 in cs.CL, cs.AI, and cs.LG

Abstract: Vaccine concerns are an ever-evolving target and can shift quickly, as seen during the COVID-19 pandemic. Identifying longitudinal trends in vaccine concerns and misinformation might inform the healthcare space by helping public health efforts strategically allocate resources or information campaigns. We explore the task of detecting vaccine concerns in online discourse using LLMs in a zero-shot setting, without the need for expensive training datasets. Since real-time monitoring of online sources requires large-scale inference, we explore cost-accuracy trade-offs of different prompting strategies and offer concrete takeaways that may inform choices in system designs for current applications. An analysis of different prompting strategies reveals that classifying the concerns over multiple passes through the LLM, each consisting of a boolean question asking whether the text mentions a given vaccine concern, works best. Our results indicate that GPT-4 can strongly outperform crowdworker accuracy when compared to ground truth annotations provided by experts on the recently introduced VaxConcerns dataset, achieving an overall F1 score of 78.7%.
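The multi-pass strategy described in the abstract can be illustrated with a minimal sketch: one boolean question per label, one pass per question. Everything concrete here is an assumption made for illustration and does not come from the paper: the two-level `TAXONOMY` uses placeholder labels (not the VaxConcerns label set), the prompt wording in `build_prompt` is invented, and `keyword_stub` is a stand-in for a real LLM call.

```python
from typing import Callable, Dict, List

# Hypothetical two-level taxonomy. These labels are placeholders for
# illustration only; they are not the VaxConcerns label set.
TAXONOMY: Dict[str, List[str]] = {
    "health risks": ["harmful ingredients", "side effects"],
    "lack of benefits": ["ineffective vaccine"],
}


def build_prompt(text: str, label: str) -> str:
    """Build one yes/no question about a single label (assumed wording)."""
    return (
        f"Passage: {text}\n"
        f"Does the passage express the vaccine concern '{label}'? "
        "Answer yes or no."
    )


def classify(text: str, ask: Callable[[str], str]) -> Dict[str, bool]:
    """Run one pass per label and collect the boolean answers.

    `ask` is any function that maps a prompt to a model reply; the number
    of calls equals the number of labels in the taxonomy.
    """
    labels = [
        node
        for parent, children in TAXONOMY.items()
        for node in (parent, *children)
    ]
    return {
        label: ask(build_prompt(text, label)).strip().lower().startswith("yes")
        for label in labels
    }


def keyword_stub(prompt: str) -> str:
    """Stand-in for an LLM: answers 'yes' if the label appears verbatim."""
    passage, question = prompt.split("\n", 1)
    label = question.split("'")[1]
    return "yes" if label in passage.lower() else "no"


if __name__ == "__main__":
    result = classify("I worry about side effects from the shot.", keyword_stub)
    for label, present in result.items():
        print(f"{label}: {present}")
```

In this sketch the cost grows linearly with the number of labels, since each label needs its own pass; that is the kind of cost-accuracy trade-off the abstract refers to. Swapping `keyword_stub` for a function that queries an actual model would leave `classify` unchanged.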
