Knowledge-Empowered Representation Learning for Chinese Medical Reading Comprehension: Task, Model and Resources (2008.10327v2)

Published 24 Aug 2020 in cs.CL, cs.AI, and cs.LG

Abstract: Machine Reading Comprehension (MRC) aims to extract answers to questions given a passage. It has been widely studied recently, especially in open domains. However, few efforts have been made on closed-domain MRC, mainly due to the lack of large-scale training data. In this paper, we introduce a multi-target MRC task for the medical domain, whose goal is to predict both answers to medical questions and the corresponding support sentences from medical information sources simultaneously, in order to ensure the high reliability of medical knowledge serving. A high-quality dataset, named the Multi-task Chinese Medical MRC dataset (CMedMRC), is manually constructed for this purpose, with detailed analysis conducted. We further propose a Chinese medical BERT model for the task (CMedBERT), which fuses medical knowledge into pre-trained language models via a dynamic fusion mechanism over heterogeneous features and a multi-task learning strategy. Experiments show that CMedBERT consistently outperforms strong baselines by fusing context-aware and knowledge-aware token representations.
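
The abstract only names the two key ingredients: a dynamic fusion of heterogeneous (context-aware and knowledge-aware) token features, and a multi-task objective over answer spans and support sentences. As a rough illustration of what such components could look like, here is a minimal PyTorch sketch; the module and function names (`GatedKnowledgeFusion`, `multitask_loss`, `alpha`) and the specific gating form are hypothetical stand-ins, not the authors' actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedKnowledgeFusion(nn.Module):
    """Illustrative gated fusion of context-aware and knowledge-aware
    token representations (a sketch, not the paper's exact layer)."""
    def __init__(self, hidden_dim: int, kg_dim: int):
        super().__init__()
        # Project knowledge-graph entity embeddings into the encoder's space.
        self.kg_proj = nn.Linear(kg_dim, hidden_dim)
        # Gate computed from the concatenated features decides, per dimension,
        # how much context vs. knowledge to keep ("dynamic" fusion).
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, context: torch.Tensor, knowledge: torch.Tensor) -> torch.Tensor:
        # context:   (batch, seq_len, hidden_dim) token states from a BERT encoder
        # knowledge: (batch, seq_len, kg_dim) entity embeddings aligned per token
        k = self.kg_proj(knowledge)
        g = torch.sigmoid(self.gate(torch.cat([context, k], dim=-1)))
        return g * context + (1.0 - g) * k  # dynamically weighted mixture

def multitask_loss(span_logits, span_labels, support_logits, support_labels,
                   alpha: float = 0.5):
    """Joint objective over the two targets: answer-span extraction
    (cross-entropy) and support-sentence selection (binary classification)."""
    span_loss = F.cross_entropy(span_logits, span_labels)
    support_loss = F.binary_cross_entropy_with_logits(support_logits,
                                                      support_labels)
    return alpha * span_loss + (1.0 - alpha) * support_loss
```

In this reading, the gate lets the model fall back on pure contextual representations for tokens without useful knowledge matches, while the shared encoder is trained jointly so that support-sentence evidence regularizes answer prediction.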

Authors (6)
  1. Taolin Zhang (34 papers)
  2. Chengyu Wang (93 papers)
  3. Minghui Qiu (58 papers)
  4. Bite Yang (2 papers)
  5. Xiaofeng He (33 papers)
  6. Jun Huang (126 papers)
Citations (4)
