Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Multiple Sclerosis Severity Classification From Clinical Text (2010.15316v1)

Published 29 Oct 2020 in cs.CL and cs.LG

Abstract: Multiple Sclerosis (MS) is a chronic, inflammatory and degenerative neurological disease, which is monitored by a specialist using the Expanded Disability Status Scale (EDSS) and recorded in unstructured text in the form of a neurology consult note. An EDSS measurement contains an overall "EDSS" score and several functional subscores. Typically, expert knowledge is required to interpret consult notes and generate these scores. Previous approaches used limited context length Word2Vec embeddings and keyword searches to predict scores given a consult note, but often failed when scores were not explicitly stated. In this work, we present MS-BERT, the first publicly available transformer model trained on real clinical data other than MIMIC. Next, we present MSBC, a classifier that applies MS-BERT to generate embeddings and predict EDSS and functional subscores. Lastly, we explore combining MSBC with other models through the use of Snorkel to generate scores for unlabelled consult notes. MSBC achieves state-of-the-art performance on all metrics and prediction tasks and outperforms the models generated from the Snorkel ensemble. We improve Macro-F1 by 0.12 (to 0.88) for predicting EDSS and on average by 0.29 (to 0.63) for predicting functional subscores over previous Word2Vec CNN and rule-based approaches.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (8)
  1. Alister D Costa (1 paper)
  2. Stefan Denkovski (3 papers)
  3. Michal Malyska (3 papers)
  4. Sae Young Moon (4 papers)
  5. Brandon Rufino (1 paper)
  6. Zhen Yang (160 papers)
  7. Taylor Killian (9 papers)
  8. Marzyeh Ghassemi (96 papers)
Citations (12)

Summary

We haven't generated a summary for this paper yet.