
Annotating Electronic Medical Records for Question Answering (1805.06816v1)

Published 17 May 2018 in cs.CL and cs.CY

Abstract: Our research addresses the relatively unexplored area of question answering technologies for patient-specific questions over electronic health records. A large dataset of question and answer pairs curated by human experts is an important prerequisite for developing, training, and evaluating any question answering system that is powered by machine learning. In this paper, we describe a process for creating such a dataset of questions and answers. Our methodology is replicable, can be conducted by medical students as annotators, and results in high inter-annotator agreement (0.71 Cohen's kappa). Over the course of 11 months, 11 medical students followed our annotation methodology, resulting in a question answering dataset of 5696 questions over 71 patient records, of which 1747 questions have corresponding answers generated by the medical students.
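
For readers unfamiliar with the agreement statistic quoted above, the following is a minimal sketch of how Cohen's kappa is generally computed for two annotators: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name `cohens_kappa` and the toy labels are illustrative assumptions; they are not the paper's data, and the abstract does not say how the authors set up their own calculation.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equally long")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each category, the product of the two
    # annotators' marginal probabilities, summed over categories.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout;
        # kappa is undefined, so report perfect agreement by convention.
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Toy labels (made up): the annotators agree on 3 of 4 items.
annotator_1 = ["yes", "yes", "no", "no"]
annotator_2 = ["yes", "no", "no", "no"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

In this toy case the observed agreement is 0.75 and the chance agreement is 0.5, so kappa is 0.5. Because kappa discounts the agreement two annotators would reach by chance, it is lower than the raw agreement rate.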

Authors (4)
  1. Preethi Raghavan
  2. Siddharth Patwardhan
  3. Jennifer J. Liang
  4. Murthy V. Devarakonda
Citations (17)