Pre-trained Language Model for Biomedical Question Answering (1909.08229v1)
Abstract: The recent success of question answering systems is largely attributed to pre-trained language models. However, as language models are mostly pre-trained on general domain corpora such as Wikipedia, they often have difficulty in understanding biomedical questions. In this paper, we investigate the performance of BioBERT, a pre-trained biomedical language model, in answering biomedical questions including factoid, list, and yes/no type questions. BioBERT uses almost the same structure across the various question types and achieved the best performance in the 7th BioASQ Challenge (Task 7b, Phase B). BioBERT pre-trained on SQuAD or SQuAD 2.0 easily outperformed previous state-of-the-art models, and it obtains the best performance when appropriate pre-/post-processing strategies are applied to questions, passages, and answers.
- Wonjin Yoon
- Jinhyuk Lee
- Donghyeon Kim
- Minbyul Jeong
- Jaewoo Kang
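The abstract describes fine-tuning BioBERT for extractive (factoid/list) biomedical question answering after intermediate fine-tuning on SQuAD. As a rough illustration of that setup, the sketch below runs extractive QA with a SQuAD-fine-tuned BioBERT checkpoint via the Hugging Face `transformers` library. This is only a minimal sketch under assumptions: the checkpoint name `dmis-lab/biobert-base-cased-v1.1-squad`, the example question, and the passage are illustrative, and the original system used its own BioASQ-specific pre-/post-processing rather than this pipeline.

```python
# Minimal extractive-QA sketch (assumed setup, not the authors' implementation).
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Assumed publicly available BioBERT checkpoint fine-tuned on SQuAD.
model_name = "dmis-lab/biobert-base-cased-v1.1-squad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

# Hypothetical factoid question and supporting passage.
question = "Which gene is mutated in cystic fibrosis?"
context = (
    "Cystic fibrosis is an autosomal recessive disorder caused by "
    "mutations in the CFTR gene, which encodes a chloride channel."
)

# Encode the question-passage pair and run the model.
inputs = tokenizer(question, context, return_tensors="pt",
                   truncation=True, max_length=384)
with torch.no_grad():
    outputs = model(**inputs)

# Pick the most likely answer span from the start/end logits.
start = int(torch.argmax(outputs.start_logits))
end = int(torch.argmax(outputs.end_logits))
answer_ids = inputs["input_ids"][0][start : end + 1]
print(tokenizer.decode(answer_ids, skip_special_tokens=True))  # e.g. "CFTR"
```

For list-type questions, a system in this style would typically keep several top-scoring spans instead of only the argmax, and yes/no questions would use a sequence-classification head rather than span prediction; the details of those strategies are in the paper itself.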