Comparing and combining some popular NER approaches on Biomedical tasks (2305.19120v1)

Published 30 May 2023 in cs.CL and cs.LG

Abstract: We compare three simple and popular approaches for NER: 1) SEQ (sequence-labeling with a linear token classifier) 2) SeqCRF (sequence-labeling with Conditional Random Fields), and 3) SpanPred (span-prediction with boundary token embeddings). We compare the approaches on 4 biomedical NER tasks: GENIA, NCBI-Disease, LivingNER (Spanish), and SocialDisNER (Spanish). The SpanPred model demonstrates state-of-the-art performance on LivingNER and SocialDisNER, improving F1 by 1.3 and 0.6 F1 respectively. The SeqCRF model also demonstrates state-of-the-art performance on LivingNER and SocialDisNER, improving F1 by 0.2 F1 and 0.7 respectively. The SEQ model is competitive with the state-of-the-art on the LivingNER dataset. We explore some simple ways of combining the three approaches. We find that majority voting consistently gives high precision and high F1 across all 4 datasets. Lastly, we implement a system that learns to combine the predictions of SEQ and SpanPred, generating systems that consistently give high recall and high F1 across all 4 datasets. On the GENIA dataset, we find that our learned combiner system significantly boosts F1(+1.2) and recall(+2.1) over the systems being combined. We release all the well-documented code necessary to reproduce all systems at https://github.com/flyingmothman/bionlp.

Citations (2)

View on Semantic Scholar

Summary

We haven't generated a summary for this paper yet.

Summarize Now

GitHub

GitHub - flyingmothman/bionlp: The models and framework used in the BioNLP 2023 paper titled "Comparing and combining some popular NER approaches on Biomedical tasks" can be found here ! (3 stars)

Comparing and combining some popular NER approaches on Biomedical tasks (2305.19120v1)

Summary

Related Papers

GitHub