Extracting Structured Data from Physician-Patient Conversations By Predicting Noteworthy Utterances (2007.07151v1)

Published 14 Jul 2020 in cs.LG, cs.AI, cs.CL, and stat.ML

Abstract: Despite diverse efforts to mine various modalities of medical data, the conversations between physicians and patients at the time of care remain an untapped source of insights. In this paper, we leverage this data to extract structured information that might assist physicians with post-visit documentation in electronic health records, potentially lightening the clerical burden. In this exploratory study, we describe a new dataset consisting of conversation transcripts, post-visit summaries, corresponding supporting evidence (in the transcript), and structured labels. We focus on the tasks of recognizing relevant diagnoses and abnormalities in the review of organ systems (RoS). One methodological challenge is that the conversations are long (around 1500 words), making it difficult for modern deep-learning models to use them as input. To address this challenge, we extract noteworthy utterances: parts of the conversation likely to be cited as evidence supporting some summary sentence. We find that by first filtering for (predicted) noteworthy utterances, we can significantly boost predictive performance for recognizing both diagnoses and RoS abnormalities.
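
The filter-then-classify idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the threshold, and the keyword-based scorer are all hypothetical stand-ins for a trained noteworthiness model, and the downstream diagnosis or RoS classifier is left out. It only shows how a long transcript might be shortened before being passed on.

```python
from typing import Callable, List

def filter_noteworthy(
    utterances: List[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep utterances scored at or above `threshold`, preserving order."""
    return [u for u in utterances if score_fn(u) >= threshold]

def keyword_score(utterance: str) -> float:
    """Hypothetical scorer: a toy stand-in for a learned noteworthiness model."""
    keywords = {"pain", "cough", "fever", "diabetes"}  # illustrative only
    words = {w.strip(".,?!").lower() for w in utterance.split()}
    return 1.0 if words & keywords else 0.0

conversation = [
    "Good morning, how was the drive in?",
    "I have had a dry cough for two weeks.",
    "Any fever at night?",
    "The parking here is terrible.",
]

# Stage 1: keep only the utterances predicted to be noteworthy.
filtered = filter_noteworthy(conversation, keyword_score)

# Stage 2 (not shown): the shortened text would be the input to a
# diagnosis or RoS classifier instead of the full transcript.
classifier_input = " ".join(filtered)
```

In this toy run only the two clinically relevant utterances survive the filter, so the text handed to the second stage is much shorter than the original conversation.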

Authors (5)
  1. Kundan Krishna (14 papers)
  2. Amy Pavel (20 papers)
  3. Benjamin Schloss (2 papers)
  4. Jeffrey P. Bigham (48 papers)
  5. Zachary C. Lipton (137 papers)
Citations (17)
