Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
41 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
41 tokens/sec
o3 Pro
7 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Clustering-based Inference for Biomedical Entity Linking (2010.11253v2)

Published 21 Oct 2020 in cs.CL

Abstract: Due to large number of entities in biomedical knowledge bases, only a small fraction of entities have corresponding labelled training data. This necessitates entity linking models which are able to link mentions of unseen entities using learned representations of entities. Previous approaches link each mention independently, ignoring the relationships within and across documents between the entity mentions. These relations can be very useful for linking mentions in biomedical text where linking decisions are often difficult due mentions having a generic or a highly specialized form. In this paper, we introduce a model in which linking decisions can be made not merely by linking to a knowledge base entity but also by grouping multiple mentions together via clustering and jointly making linking predictions. In experiments on the largest publicly available biomedical dataset, we improve the best independent prediction for entity linking by 3.0 points of accuracy, and our clustering-based inference model further improves entity linking by 2.3 points.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (5)
  1. Rico Angell (12 papers)
  2. Nicholas Monath (29 papers)
  3. Sunil Mohan (7 papers)
  4. Nishant Yadav (15 papers)
  5. Andrew McCallum (132 papers)
Citations (4)