Can Language Models be Biomedical Knowledge Bases? (2109.07154v1)

Published 15 Sep 2021 in cs.CL

Abstract: Pre-trained language models (LMs) have become ubiquitous in solving various NLP tasks. There has been increasing interest in what knowledge these LMs contain and how we can extract that knowledge, treating LMs as knowledge bases (KBs). While there has been much work on probing LMs in the general domain, there has been little attention to whether these powerful LMs can be used as domain-specific KBs. To this end, we create the BioLAMA benchmark, which comprises 49K biomedical factual knowledge triples for probing biomedical LMs. We find that biomedical LMs with recently proposed probing methods can achieve up to 18.51% Acc@5 on retrieving biomedical knowledge. Although this seems promising given the task difficulty, our detailed analyses reveal that most predictions are highly correlated with prompt templates without any subjects, hence producing similar results on each relation and hindering their capabilities to be used as domain-specific KBs. We hope that BioLAMA can serve as a challenging benchmark for biomedical factual probing.
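The Acc@5 metric reported above scores a probe as correct when the gold object of a knowledge triple appears among the model's top-5 predictions. A minimal sketch of that computation (the function name and the example candidate lists are illustrative, not taken from the paper's code):

```python
def acc_at_k(predictions, gold, k=5):
    """Fraction of triples whose gold object appears in the top-k predictions.

    predictions: list of ranked candidate lists (best first), one per triple.
    gold: list of gold object strings, one per triple.
    """
    hits = sum(1 for preds, g in zip(predictions, gold) if g in preds[:k])
    return hits / len(gold)

# Hypothetical example: the gold answer is ranked in the top 5 for one
# of the two triples, so Acc@5 = 0.5.
preds = [
    ["aspirin", "ibuprofen", "naproxen", "warfarin", "heparin"],
    ["insulin", "metformin", "glipizide", "acarbose", "miglitol"],
]
gold = ["warfarin", "statin"]
print(acc_at_k(preds, gold))  # 0.5
```

In the paper's setting, the ranked candidates would come from a masked biomedical LM filling in the object slot of a relation-specific prompt template.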

Authors (6)
  1. Mujeen Sung (20 papers)
  2. Jinhyuk Lee (27 papers)
  3. Sean Yi (1 paper)
  4. Minji Jeon (3 papers)
  5. Sungdong Kim (30 papers)
  6. Jaewoo Kang (83 papers)
Citations (99)