Named Entity Recognition Under Domain Shift via Metric Learning for Life Sciences (2401.10472v2)
Abstract: Named entity recognition is a key component of Information Extraction (IE), particularly in scientific domains such as biomedicine and chemistry, where LLMs, e.g., ChatGPT, fall short. We investigate the applicability of transfer learning for enhancing a named entity recognition model trained in the biomedical domain (the source domain) to be used in the chemical domain (the target domain). A common practice for training such a model in a few-shot learning setting is to pretrain the model on the labeled source data and then fine-tune it on a handful of labeled target examples. In our experiments, we observed that such a model is prone to mislabeling source entities, which often appear in the text, as target entities. To alleviate this problem, we propose a model that transfers knowledge from the source domain to the target domain while projecting the source entities and target entities into separate regions of the feature space. This diminishes the risk of mislabeling source entities as target entities. Our model consists of two stages: 1) entity grouping in the source domain, which incorporates knowledge from annotated events to establish relations between entities, and 2) entity discrimination in the target domain, which relies on pseudo labeling and contrastive learning to enhance discrimination between the entities in the two domains. We conduct extensive experiments across three source and three target datasets, demonstrating that our method outperforms the baselines by up to 5% in absolute terms.
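The idea of projecting source and target entities into separate regions of the feature space can be illustrated with a generic margin-based pairwise contrastive loss. The sketch below is not the loss used in the paper; the function names, the toy 2-D embeddings, and the margin value are assumptions chosen only for illustration. Pairs from the same group are pulled together, and pairs from different groups are pushed apart until they are at least `margin` away from each other.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_contrastive_loss(embeddings, groups, margin=1.0):
    """Average margin-based contrastive loss over all pairs (hypothetical helper).

    Same-group pairs contribute d^2 (pull together); different-group pairs
    contribute max(0, margin - d)^2 (push apart up to the margin).
    """
    total, count = 0.0, 0
    for i, j in combinations(range(len(embeddings)), 2):
        d = euclidean(embeddings[i], embeddings[j])
        if groups[i] == groups[j]:
            total += d ** 2
        else:
            total += max(0.0, margin - d) ** 2
        count += 1
    return total / count if count else 0.0


# Toy entity embeddings: two source-domain and two target-domain entities.
groups = ["source", "source", "target", "target"]

# Source and target entities occupy distinct regions of the space.
separated = [[0.0, 0.0], [0.0, 0.1], [2.0, 2.0], [2.0, 2.1]]
# Source and target entities are interleaved in the space.
mixed = [[0.0, 0.0], [2.0, 2.0], [0.0, 0.1], [2.0, 2.1]]

loss_separated = pairwise_contrastive_loss(separated, groups)
loss_mixed = pairwise_contrastive_loss(mixed, groups)
print(f"separated: {loss_separated:.4f}  mixed: {loss_mixed:.4f}")
```

In this toy setting the loss is lower when the two domains occupy distinct regions, which is the property a contrastive objective of this kind rewards during training.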