ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission
Introduction
Clinical notes are a crucial part of Electronic Health Records (EHRs), containing valuable information that can aid in clinical decision-making. However, the high-dimensional, unstructured, and sparse nature of clinical notes makes them challenging to incorporate into predictive models. This paper addresses these challenges by proposing ClinicalBERT, a specialized version of BERT pre-trained on clinical text, designed to model clinical notes for predicting 30-day hospital readmission.
Methodology
ClinicalBERT adapts the BERT architecture by pre-training on clinical notes from the MIMIC-III dataset. Pre-training uses BERT's two unsupervised objectives, masked language modeling and next sentence prediction, to capture relationships within and across sentences of clinical notes. The pre-trained ClinicalBERT is then fine-tuned to predict 30-day hospital readmission, dynamically updating risk scores as new notes are added.
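The masked-language-modeling objective can be illustrated with a minimal sketch: hide a random fraction of input tokens and keep the originals as prediction targets. This is a simplification of BERT's actual recipe (which replaces 15% of tokens, of which 80% become [MASK], 10% a random token, and 10% stay unchanged); the tokens, seed, and function name here are illustrative only.

```python
import random

MASK, MASK_RATE = "[MASK]", 0.15

def mask_tokens(tokens, seed=0):
    """Simplified masked-LM input: hide ~15% of tokens and
    record the originals as prediction targets (index -> token)."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < MASK_RATE:
            targets[i] = tok
            masked.append(MASK)
        else:
            masked.append(tok)
    return masked, targets

masked, targets = mask_tokens("the patient was admitted with sepsis".split(), seed=1)
```

During pre-training, the model is trained to reconstruct each entry of `targets` from the surrounding unmasked context.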
Pre-training and Fine-tuning
The pre-trained model builds continuous representations of clinical notes. These representations are then refined on a specific downstream task: 30-day hospital readmission prediction. Fine-tuning attaches a binary classifier, a linear transformation followed by a sigmoid, to the representation of the [CLS] token, which summarizes the entire input sequence.
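The classification head described above can be sketched in a few lines of NumPy. The 768-dimensional [CLS] vector matches BERT-base; the weights here are random placeholders, whereas in fine-tuning they would be learned jointly with the encoder.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def readmission_probability(h_cls, W, b):
    """Readmission score: linear transformation of the [CLS]
    representation, squashed to a probability by a sigmoid."""
    return sigmoid(W @ h_cls + b)

# Toy example: a 768-dim [CLS] vector, as in BERT-base.
rng = np.random.default_rng(0)
h_cls = rng.normal(size=768)
W = rng.normal(size=768) * 0.01  # placeholder weights, not trained values
b = 0.0
p = readmission_probability(h_cls, W, b)
```

The sigmoid guarantees the output lies in (0, 1), so it can be read directly as a risk score and thresholded for a binary decision.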
Token Representation
ClinicalBERT represents each input as the sum of subword token embeddings, segment embeddings, and position embeddings. These components allow the model to manage lengthy and complex clinical notes effectively, and the self-attention layers built on top of them capture interactions between distant tokens, providing a richer understanding of the context within clinical notes.
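The three-way embedding sum can be sketched directly. The vocabulary size, sequence length, and embedding dimension below are tiny toy values (BERT-base uses a ~30k WordPiece vocabulary and 768 dimensions), and the embedding tables are random rather than learned.

```python
import numpy as np

VOCAB, SEGMENTS, MAX_LEN, DIM = 100, 2, 16, 8  # toy sizes, not BERT's

rng = np.random.default_rng(1)
token_emb = rng.normal(size=(VOCAB, DIM))
segment_emb = rng.normal(size=(SEGMENTS, DIM))
position_emb = rng.normal(size=(MAX_LEN, DIM))

def embed(token_ids, segment_ids):
    """BERT-style input representation: the element-wise sum of
    token, segment, and position embeddings at each position."""
    positions = np.arange(len(token_ids))
    return token_emb[token_ids] + segment_emb[segment_ids] + position_emb[positions]

x = embed(np.array([5, 17, 42]), np.array([0, 0, 1]))
```

Because position and segment information is added into every token vector, the downstream attention layers can distinguish word order and sentence membership without any recurrence.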
Performance Evaluation
Language Modeling and Clinical Word Similarity
ClinicalBERT significantly outperforms baseline BERT on masked language modeling and next sentence prediction. Its embeddings also show a higher Pearson correlation with physician-rated similarity scores for pairs of medical terms than other embedding models such as Word2Vec and FastText.
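The word-similarity evaluation can be sketched as follows: compute the cosine similarity between the model's embeddings for each term pair, then correlate those similarities with clinician ratings. The term pairs, ratings, and random embeddings below are purely illustrative, not values from the actual benchmark.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc)))

# Hypothetical term pairs with made-up physician ratings and toy embeddings.
rng = np.random.default_rng(2)
emb = {t: rng.normal(size=16) for t in ["kidney", "renal", "heart", "cardiac", "aspirin"]}
pairs = [("kidney", "renal"), ("heart", "cardiac"), ("kidney", "aspirin")]
ratings = [3.9, 3.8, 0.5]  # illustrative similarity scores, not real data

model_sims = [cosine(emb[a], emb[b]) for a, b in pairs]
r = pearson(model_sims, ratings)
```

A higher `r` means the geometry of the embedding space better matches clinical judgments of term relatedness.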
Readmission Prediction
Two key experiments are conducted to evaluate ClinicalBERT’s ability to predict 30-day hospital readmission:
- Using Discharge Summaries: ClinicalBERT shows superior performance compared to the bag-of-words model, Bi-LSTM with Word2Vec embeddings, and BERT. ClinicalBERT achieves an AUROC of 0.714, showcasing its efficacy in utilizing discharge summaries for predictions.
- Using Early Clinical Notes: The model's performance is tested with clinical notes from the first 48 or 72 hours of patient admission. ClinicalBERT outperforms the benchmarks, indicating that early notes can effectively predict readmissions, facilitating timely interventions.
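An admission can accumulate far more text than BERT's input limit, so notes are split into subsequences that each receive their own prediction. The original ClinicalBERT work combines these into one patient-level score by weighting the maximum and mean subsequence probabilities, with a scaling hyperparameter c; the sketch below assumes that scheme and an illustrative c value.

```python
def aggregate_readmission_score(subseq_probs, c=2.0):
    """Combine per-subsequence probabilities into one patient-level
    score: a weighted blend of the max (most alarming subsequence)
    and the mean (overall signal), scaled by the number of
    subsequences n and hyperparameter c."""
    n = len(subseq_probs)
    p_max = max(subseq_probs)
    p_mean = sum(subseq_probs) / n
    return (p_max + p_mean * n / c) / (1 + n / c)

score = aggregate_readmission_score([0.2, 0.9, 0.4])
```

The max term keeps a single high-risk subsequence from being diluted, while the mean term rewards consistent signal across many notes; for a single subsequence the score reduces to that subsequence's probability.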
Interpretability
Attention mechanisms in ClinicalBERT allow for interpretability of the model’s predictions. By visualizing attention weights, it is possible to identify predictive terms and phrases within clinical notes. This feature is crucial as it can help clinicians understand why a particular prediction was made, thereby increasing trust in the model.
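The attention-based inspection described above boils down to reading off softmax-normalized scores between a query position and every token's key. This is a minimal scaled dot-product sketch with random vectors and a made-up token list; in practice the weights would come from a trained ClinicalBERT layer.

```python
import numpy as np

def attention_weights(q, K, d):
    """Scaled dot-product attention weights of one query vector q
    over all key vectors K (numerically stable softmax)."""
    scores = K @ q / np.sqrt(d)
    e = np.exp(scores - scores.max())
    return e / e.sum()

tokens = ["patient", "reports", "chest", "pain", "on", "exertion"]
rng = np.random.default_rng(3)
d = 8
q = rng.normal(size=d)                  # query for the position being explained
K = rng.normal(size=(len(tokens), d))   # one key vector per token

w = attention_weights(q, K, d)
top = [tokens[i] for i in np.argsort(w)[::-1][:3]]  # most-attended tokens
```

Ranking tokens by their attention weight surfaces the words the model attended to most for a given prediction, which is the basis of the visualizations used to explain readmission scores.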
Practical Implications
The successful application of ClinicalBERT in predicting hospital readmissions has several practical implications. It can be adapted to various clinical tasks such as mortality prediction, disease prediction, or length-of-stay estimation. By training ClinicalBERT on institution-specific EHR data, hospitals can achieve more accurate and tailored predictive models, enhancing the overall efficacy of healthcare delivery.
Conclusion and Future Work
ClinicalBERT offers a robust framework for extracting meaningful representations from clinical notes, outperforming existing models in readmission prediction. Future work includes scaling ClinicalBERT for long clinical notes to capture more nuanced dependencies. Given the vast amount of clinical text available in hospitals, training ClinicalBERT on larger datasets promises further improvements.
Acknowledgements
The authors thank Noémie Elhadad for her invaluable insights. This work leverages the publicly available ClinicalBERT model parameters and associated scripts, facilitating further research and application in different clinical tasks.