Improving Biomedical Entity Linking with Retrieval-enhanced Learning (2312.09806v1)

Published 15 Dec 2023 in cs.CL, cs.AI, and cs.LG

Abstract: Biomedical entity linking (BioEL) has achieved remarkable progress with the help of pre-trained LLMs. However, existing BioEL methods usually struggle to handle rare and difficult entities because entity mentions follow a long-tailed distribution. To address this limitation, we introduce a new scheme, $k$NN-BioEL, which gives a BioEL model the ability to reference similar instances from the entire training corpus as clues for prediction, thereby improving its generalization capability. Moreover, we design a contrastive learning objective with dynamic hard negative sampling (DHNS) that improves the quality of the neighbors retrieved during inference. Extensive experimental results show that $k$NN-BioEL outperforms state-of-the-art baselines on several datasets.
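
The retrieval idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how retrieved neighbors are combined with the model's prediction, so the linear interpolation below, in the style of nearest-neighbor machine translation, is an assumption, as are the names `knn_distribution` and `interpolate`, the parameters `k`, `temperature`, and `lam`, the toy two-dimensional embeddings, and the entity IDs `D001` and `D002`.

```python
import math
from collections import defaultdict


def knn_distribution(query, datastore, k=2, temperature=1.0):
    """Turn the k nearest stored mentions into a distribution over entity IDs.

    datastore is a list of (embedding, entity_id) pairs, assumed to be built
    from encoded training mentions. Distance is squared Euclidean.
    """
    scored = sorted(
        (sum((q - x) ** 2 for q, x in zip(query, emb)), entity)
        for emb, entity in datastore
    )[:k]
    weights = defaultdict(float)
    for dist, entity in scored:
        # Closer neighbors get larger weight; neighbors sharing an ID add up.
        weights[entity] += math.exp(-dist / temperature)
    total = sum(weights.values())
    return {entity: w / total for entity, w in weights.items()}


def interpolate(model_probs, knn_probs, lam=0.5):
    """Mix the model's distribution with the neighbor-based one (assumed scheme)."""
    entities = set(model_probs) | set(knn_probs)
    return {
        e: (1 - lam) * model_probs.get(e, 0.0) + lam * knn_probs.get(e, 0.0)
        for e in entities
    }


# Toy datastore: two training mentions of D001 and one of D002 (hypothetical IDs).
datastore = [([0.0, 0.0], "D001"), ([0.1, 0.0], "D001"), ([1.0, 1.0], "D002")]
query = [0.05, 0.0]                        # embedding of a rare test mention
model_probs = {"D001": 0.4, "D002": 0.6}   # the base model alone prefers D002

knn_probs = knn_distribution(query, datastore, k=2)
final_probs = interpolate(model_probs, knn_probs, lam=0.5)
prediction = max(final_probs, key=final_probs.get)
print(prediction, final_probs)
```

In this toy setting the base model alone would pick `D002`, but both nearest training mentions are labeled `D001`, so the interpolated distribution picks `D001`. The sketch also suggests why neighbor quality matters: if the embedding space places unrelated mentions close together, the retrieved clues mislead the prediction, which is the problem the contrastive objective with DHNS is described as addressing.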
