Leveraging ChatGPT in Pharmacovigilance Event Extraction: An Empirical Study (2402.15663v1)

Published 24 Feb 2024 in cs.CL

Abstract: With the advent of LLMs, there has been growing interest in exploring their potential for medical applications. This research investigates the ability of LLMs, specifically ChatGPT, in pharmacovigilance event extraction, whose goal is to identify and extract adverse events or potential therapeutic events from textual medical sources. We conduct extensive experiments to assess the performance of ChatGPT on the pharmacovigilance event extraction task, employing various prompts and demonstration selection strategies. The findings show that while ChatGPT achieves reasonable performance with appropriate demonstration selection strategies, it still falls short of fully fine-tuned small models. Additionally, we explore the potential of leveraging ChatGPT for data augmentation. However, our investigation reveals that including synthesized data in fine-tuning may decrease performance, possibly owing to noise in the ChatGPT-generated labels. To mitigate this, we explore different filtering strategies and find that, with the proper approach, more stable performance can be achieved, although consistent improvement remains elusive.
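The demonstration selection the abstract refers to is commonly implemented as similarity-based retrieval: for each query, the most similar labeled examples are retrieved from a pool and placed in the prompt as in-context demonstrations. The following is a minimal sketch of that idea, not the paper's actual method; it substitutes a toy bag-of-words encoder for a real sentence embedder (e.g. Sentence-BERT), and the example texts and labels are hypothetical.

```python
import re
from collections import Counter
from math import sqrt


def embed(text):
    """Toy bag-of-words vector; a stand-in for a learned sentence encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[tok] * b[tok] for tok in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_demonstrations(query, pool, k=2):
    """Return the k pool examples most similar to the query text."""
    q = embed(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, embed(ex["text"])), reverse=True)
    return ranked[:k]


# Hypothetical labeled pool for illustration only.
pool = [
    {"text": "Patient developed rash after taking amoxicillin.",
     "label": "adverse_event"},
    {"text": "Headache resolved following ibuprofen treatment.",
     "label": "potential_therapeutic_event"},
    {"text": "Nausea reported after chemotherapy infusion.",
     "label": "adverse_event"},
]

demos = select_demonstrations("Severe rash occurred after amoxicillin use.", pool, k=1)
```

The retrieved demonstrations would then be formatted into the prompt ahead of the query; in practice the encoder, similarity metric, and pool size are the main design choices such a strategy varies.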

Authors (4)
  1. Zhaoyue Sun
  2. Gabriele Pergola
  3. Byron C. Wallace
  4. Yulan He