
Scope of Large Language Models for Mining Emerging Opinions in Online Health Discourse (2403.03336v1)

Published 5 Mar 2024 in cs.CL and cs.SI

Abstract: In this paper, we develop an LLM-powered framework for the curation and evaluation of emerging opinion mining in online health communities. We formulate emerging opinion mining as a pairwise stance detection problem between (title, comment) pairs sourced from Reddit, where post titles contain emerging health-related claims on a topic that is not predefined. The claims are either explicitly or implicitly expressed by the user. We detail (i) a method of claim identification -- the task of identifying whether a post title contains a claim -- and (ii) an opinion mining-driven evaluation framework for stance detection using LLMs. We facilitate our exploration by releasing a novel test dataset, Long COVID-Stance, or LC-Stance, which can be used to evaluate LLMs on the tasks of claim identification and stance detection in online health communities. Long COVID is an emerging post-COVID disorder with uncertain and complex treatment guidelines, which makes it a suitable use case for our task. LC-Stance contains long COVID treatment-related discourse sourced from a Reddit community. Our evaluation shows that GPT-4 significantly outperforms prior work on zero-shot stance detection. We then perform thorough LLM diagnostics, identifying claim type (i.e., implicit vs. explicit claims) and comment length as sources of model error.
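
The pairwise formulation in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the label set (`favor`, `against`, `neutral`), the prompt wording, and the names `TitleCommentPair`, `build_prompt`, and `parse_label` are all assumptions made for illustration, and no LLM is called. The sketch only shows how a (title, comment) pair could be turned into a zero-shot prompt and how a free-text reply could be mapped back to a label.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical label set: the abstract does not say which labels LC-Stance uses.
LABELS = ("favor", "against", "neutral")


@dataclass(frozen=True)
class TitleCommentPair:
    """One stance-detection instance: a post title and a reply to it."""
    title: str    # post title that may state a claim, explicitly or implicitly
    comment: str  # comment whose stance toward the title's claim is judged


def build_prompt(pair: TitleCommentPair) -> str:
    """Format a zero-shot prompt for the pair (wording is an assumption)."""
    return (
        f"Claim (post title): {pair.title}\n"
        f"Comment: {pair.comment}\n"
        "What is the stance of the comment toward the claim? "
        f"Answer with one word: {', '.join(LABELS)}."
    )


def parse_label(response: str) -> Optional[str]:
    """Map a model reply to a known label, or None if it matches none."""
    text = response.strip().lower()
    for label in LABELS:
        if text.startswith(label):
            return label
    return None
```

Returning `None` for an unrecognized reply, rather than a default label, keeps unparseable outputs visible, so they can be counted separately when diagnosing model error.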
