Knowledge-Empowered Representation Learning for Chinese Medical Reading Comprehension: Task, Model and Resources (2008.10327v2)

Published 24 Aug 2020 in cs.CL, cs.AI, and cs.LG

Abstract: Machine Reading Comprehension (MRC) aims to extract answers to questions given a passage. It has been widely studied recently, especially in open domains. However, few efforts have been made on closed-domain MRC, mainly due to the lack of large-scale training data. In this paper, we introduce a multi-target MRC task for the medical domain, whose goal is to predict both answers to medical questions and the corresponding support sentences from medical information sources simultaneously, in order to ensure the high reliability of medical knowledge serving. A high-quality dataset, named the Multi-task Chinese Medical MRC dataset (CMedMRC), is manually constructed for this purpose, with detailed analysis conducted. We further propose a Chinese medical BERT model for the task (CMedBERT), which fuses medical knowledge into pre-trained language models via a dynamic fusion mechanism over heterogeneous features and a multi-task learning strategy. Experiments show that CMedBERT consistently outperforms strong baselines by fusing context-aware and knowledge-aware token representations.
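
The abstract only names the two key ingredients: a dynamic fusion of heterogeneous (context-aware and knowledge-aware) token features, and a multi-task objective over answer spans and support sentences. As a rough illustration of what such components could look like, here is a minimal PyTorch sketch; the module and function names (`GatedKnowledgeFusion`, `multitask_loss`, `alpha`) and the specific gating form are hypothetical stand-ins, not the authors' actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedKnowledgeFusion(nn.Module):
    """Illustrative gated fusion of context-aware and knowledge-aware
    token representations (a sketch, not the paper's exact layer)."""
    def __init__(self, hidden_dim: int, kg_dim: int):
        super().__init__()
        # Project knowledge-graph entity embeddings into the encoder's space.
        self.kg_proj = nn.Linear(kg_dim, hidden_dim)
        # Gate computed from the concatenated features decides, per dimension,
        # how much context vs. knowledge to keep ("dynamic" fusion).
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, context: torch.Tensor, knowledge: torch.Tensor) -> torch.Tensor:
        # context:   (batch, seq_len, hidden_dim) token states from a BERT encoder
        # knowledge: (batch, seq_len, kg_dim) entity embeddings aligned per token
        k = self.kg_proj(knowledge)
        g = torch.sigmoid(self.gate(torch.cat([context, k], dim=-1)))
        return g * context + (1.0 - g) * k  # dynamically weighted mixture

def multitask_loss(span_logits, span_labels, support_logits, support_labels,
                   alpha: float = 0.5):
    """Joint objective over the two targets: answer-span extraction
    (cross-entropy) and support-sentence selection (binary classification)."""
    span_loss = F.cross_entropy(span_logits, span_labels)
    support_loss = F.binary_cross_entropy_with_logits(support_logits,
                                                      support_labels)
    return alpha * span_loss + (1.0 - alpha) * support_loss
```

In this reading, the gate lets the model fall back on pure contextual representations for tokens without useful knowledge matches, while the shared encoder is trained jointly so that support-sentence evidence regularizes answer prediction.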

Authors (6)
  1. Taolin Zhang (34 papers)
  2. Chengyu Wang (93 papers)
  3. Minghui Qiu (58 papers)
  4. Bite Yang (2 papers)
  5. Xiaofeng He (33 papers)
  6. Jun Huang (126 papers)
Citations (4)
