Efficient Medical Question Answering with Knowledge-Augmented Question Generation (2405.14654v1)
Abstract: In the expanding field of LLM applications, medical knowledge representation remains a significant challenge due to the specialized nature of the domain. LLMs, such as GPT-4, obtain reasonable scores on medical question answering tasks, but smaller models are far behind. In this work, we introduce a method to improve the proficiency of a small LLM in the medical domain by employing a two-fold approach. We first fine-tune the model on a corpus of medical textbooks. Then, we use GPT-4 to generate questions similar to the downstream task, prompted with textbook knowledge, and use them to fine-tune the model. Additionally, we introduce ECN-QA, a novel medical question answering dataset containing "progressive questions" composed of related sequential questions. We show the benefits of our training strategy on this dataset. The study's findings highlight the potential of small LLMs in the medical domain when appropriately fine-tuned. The code and weights are available at https://github.com/raidium-med/MQG.
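The question-generation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt wording, the function names `build_prompt` and `generate_questions`, and the `generate` callable are all hypothetical, chosen only to show how a textbook passage and an example of the target question format might be combined into a prompt for a larger model. The `generate` argument stands in for whatever client calls that model (GPT-4 in the paper), so the sketch runs without any external API.

```python
from typing import Callable, Dict, Iterable, List


def build_prompt(passage: str, example_question: str) -> str:
    """Combine a textbook passage with one example of the target question style.

    The wording below is an illustrative assumption, not the paper's prompt.
    """
    return (
        "You are writing medical exam questions.\n\n"
        f"Textbook excerpt:\n{passage.strip()}\n\n"
        f"Example question in the target format:\n{example_question.strip()}\n\n"
        "Write one new question in the same format, "
        "answerable from the excerpt."
    )


def generate_questions(
    passages: Iterable[str],
    example_question: str,
    generate: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Produce one synthetic question per passage.

    `generate` is a hypothetical hook: any function mapping a prompt string
    to the generator model's text output.
    """
    records = []
    for passage in passages:
        prompt = build_prompt(passage, example_question)
        records.append(
            {"passage": passage, "prompt": prompt, "question": generate(prompt)}
        )
    return records


def stub_generate(prompt: str) -> str:
    """Offline stand-in for the large model, used only for demonstration."""
    return "PLACEHOLDER QUESTION"


synthetic = generate_questions(
    ["Aspirin irreversibly inhibits cyclooxygenase."],
    "Q: Which enzyme does ibuprofen inhibit? A) ... B) ...",
    stub_generate,
)
```

In this sketch the resulting records would serve as the second-stage fine-tuning data for the small model, after the first stage of fine-tuning on the textbook corpus itself. Keeping the model call behind a callable makes it straightforward to swap in a real client or to test the prompt construction in isolation.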