Automatic Extraction of Medication Names in Tweets as Named Entity Recognition (2111.15641v1)
Published 30 Nov 2021 in cs.CL
Abstract: Social media posts contain potentially valuable information about medical conditions and health-related behavior. BioCreative VII Task 3 focuses on mining this information by recognizing mentions of medications and dietary supplements in tweets. We approach this task by fine-tuning multiple BERT-style LLMs to perform token-level classification and combining them into ensembles to generate final predictions. Our best system consists of five Megatron-BERT-345M models and achieves a strict F1 score of 0.764 on unseen test data.
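The pipeline described in the abstract (token-level classification, ensembling, strict F1) can be illustrated with a minimal sketch. The abstract does not say how the ensemble members are combined, so averaging per-token class probabilities is an assumption here, as are the BIO tag set `LABELS`, the function names, and the toy probabilities; none of them come from the paper.

```python
from typing import List, Tuple

# Hypothetical BIO tag set; the paper's actual label scheme is not given in the abstract.
LABELS = ["O", "B-DRUG", "I-DRUG"]

Probs = List[List[float]]  # probs[token][label]
Span = Tuple[int, int]     # (start, end) token indices, end exclusive


def average_probs(model_probs: List[Probs]) -> Probs:
    """Average class probabilities over models, token by token (assumed ensembling rule)."""
    n_models = len(model_probs)
    n_tokens = len(model_probs[0])
    return [
        [sum(m[t][c] for m in model_probs) / n_models for c in range(len(LABELS))]
        for t in range(n_tokens)
    ]


def decode_spans(probs: Probs) -> List[Span]:
    """Take the argmax tag per token, then group B-/I- tags into entity spans."""
    tags = [LABELS[max(range(len(LABELS)), key=row.__getitem__)] for row in probs]
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B-DRUG":
            if start is not None:      # close the previous span
                spans.append((start, i))
            start = i
        elif tag == "I-DRUG":
            if start is None:          # lenient choice: an orphan I- tag opens a span
                start = i
        else:                          # "O" closes any open span
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def strict_f1(pred: List[Span], gold: List[Span]) -> float:
    """Strict F1: a predicted span counts only if both boundaries match a gold span."""
    pred_set, gold_set = set(pred), set(gold)
    tp = len(pred_set & gold_set)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_set)
    recall = tp / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Toy example for the tokens ["took", "vitamin", "d", "today"]; gold span is "vitamin d".
model_a = [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1], [0.6, 0.1, 0.3], [0.9, 0.05, 0.05]]
model_b = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.2, 0.1, 0.7], [0.95, 0.03, 0.02]]
gold = [(1, 3)]

single = decode_spans(model_a)                               # misses the token "d"
ensemble = decode_spans(average_probs([model_a, model_b]))   # recovers the full span
```

In this toy case the single model truncates the mention and scores zero under strict matching, while the averaged ensemble recovers both boundaries. In practice the per-token probabilities would come from fine-tuned models, and subword-to-word alignment would need handling that this sketch omits.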