
Biomedical Entity Linking as Multiple Choice Question Answering (2402.15189v2)

Published 23 Feb 2024 in cs.CL, cs.AI, and cs.LG

Abstract: Although biomedical entity linking (BioEL) has made significant progress with pre-trained LLMs, challenges still exist for fine-grained and long-tailed entities. To address these challenges, we present BioELQA, a novel model that treats Biomedical Entity Linking as Multiple Choice Question Answering. BioELQA first obtains candidate entities with a fast retriever, jointly presents the mention and candidate entities to a generator, and then outputs the predicted symbol associated with its chosen entity. This formulation enables explicit comparison of different candidate entities, thus capturing fine-grained interactions between mentions and entities, as well as among entities themselves. To improve generalization for long-tailed entities, we retrieve similar labeled training instances as clues and concatenate the input with retrieved instances for the generator. Extensive experimental results show that BioELQA outperforms state-of-the-art baselines on several datasets.
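The retrieve-then-choose formulation in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the retriever, the prompt template, or the generator, so a token-overlap (Jaccard) score stands in for the fast retriever, the prompt wording is invented, and the generator is not called at all. The names `retrieve`, `build_prompt`, and `decode_answer` are hypothetical and are not taken from the paper.

```python
from string import ascii_uppercase


def jaccard(a, b):
    """Token-overlap similarity; a stand-in for the paper's unspecified retriever."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve(query, items, k, key=lambda item: item):
    """Return the k items whose key text is most similar to the query."""
    return sorted(items, key=lambda item: -jaccard(query, key(item)))[:k]


def build_prompt(mention, candidates, clues=()):
    """Format the mention, optional labeled clues, and candidates as a multiple-choice question."""
    lines = [f'Clue: "{m}" refers to "{e}"' for m, e in clues]
    lines.append(f'Question: which entity does "{mention}" refer to?')
    for symbol, candidate in zip(ascii_uppercase, candidates):
        lines.append(f"({symbol}) {candidate}")
    lines.append("Answer:")
    return "\n".join(lines)


def decode_answer(symbol, candidates):
    """Map the symbol produced by a generator back to its candidate entity."""
    index = ascii_uppercase.index(symbol.strip().upper())
    if index >= len(candidates):
        raise ValueError(f"symbol {symbol!r} is not among the offered options")
    return candidates[index]


# Toy data, invented for this example.
entities = [
    "type 2 diabetes mellitus",
    "type 1 diabetes mellitus",
    "diabetes insipidus",
    "hypertension",
]
training = [
    ("adult onset diabetes", "type 2 diabetes mellitus"),
    ("high blood pressure", "hypertension"),
]

mention = "type 2 diabetes"
candidates = retrieve(mention, entities, k=3)
clues = retrieve(mention, training, k=1, key=lambda pair: pair[0])
prompt = build_prompt(mention, candidates, clues)
print(prompt)
```

In a full system the prompt would be passed to a sequence-to-sequence generator, and the single symbol it emits would be resolved with `decode_answer`. Listing all candidates in one input is what lets the model compare them against each other directly, rather than scoring each candidate in isolation.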
