Explainable Prediction of Medical Codes from Clinical Text (1802.05695v2)

Published 15 Feb 2018 in cs.CL, cs.LG, and stat.ML

Abstract: Clinical notes are text documents that are created by clinicians for each patient encounter. They are typically accompanied by medical codes, which describe the diagnosis and treatment. Annotating these codes is labor intensive and error prone; furthermore, the connection between the codes and the text is not annotated, obscuring the reasons and details behind specific diagnoses and treatments. We present an attentional convolutional network that predicts medical codes from clinical text. Our method aggregates information across the document using a convolutional neural network, and uses an attention mechanism to select the most relevant segments for each of the thousands of possible codes. The method is accurate, achieving precision@8 of 0.71 and a Micro-F1 of 0.54, which are both better than the prior state of the art. Furthermore, through an interpretability evaluation by a physician, we show that the attention mechanism identifies meaningful explanations for each code assignment.

Authors (5)

James Mullenbach
Sarah Wiegreffe
Jon Duke
Jimeng Sun
Jacob Eisenstein

Summary

The paper "Explainable Prediction of Medical Codes from Clinical Text" by Mullenbach et al. addresses the task of predicting ICD codes from clinical narratives. The task is difficult for two reasons: manual coding is labor intensive and error prone, and the label space is high-dimensional, comprising thousands of codes. The authors propose an attentional convolutional network, Convolutional Attention for Multi-Label classification (CAML), which improves on the prior state of the art in both precision@8 and Micro-F1.

Methodology

The authors formulate ICD-9 code prediction as a multi-label text classification problem and solve it with a convolutional neural network (CNN) combined with an attention mechanism. The attention component identifies the text snippets most relevant to each candidate code, which makes the predictions interpretable, a key requirement in the medical domain. Because attention is computed separately per label, CAML builds a different document representation for each code rather than a single shared one.

CAML passes word embeddings through convolutional filters and then applies an attention mechanism that, for each label, weights the segments of the text most relevant to that label. This contrasts with max-pooling, which reduces the document to a single feature vector shared by all labels. Additionally, a regularizer encourages each code's parameters to resemble an embedding of that code's textual description, so that codes with similar descriptions receive similar parameters; this improves performance on rarely observed labels.
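The difference between per-label attention and max-pooling can be made concrete with a small forward pass. The following is a minimal NumPy sketch, not the authors' implementation: the function names (`conv_encode`, `per_label_attention`, `max_pool`), the tanh nonlinearity, the zero padding, and all dimensions and random weights are assumptions chosen for illustration, and the description-based regularizer is omitted.

```python
import numpy as np

def softmax(z, axis):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def conv_encode(X, W, b):
    """Zero-padded 1-D convolution over word embeddings, followed by tanh.

    X: (N, d_e) word embeddings, W: (k, d_e, d_c) filters, b: (d_c,) bias.
    Returns H: (N, d_c), one feature vector per token position.
    """
    k = W.shape[0]
    left = (k - 1) // 2
    Xp = np.pad(X, ((left, k - 1 - left), (0, 0)))  # pad along the sequence only
    H = np.stack([np.einsum("kd,kdc->c", Xp[n:n + k], W)
                  for n in range(X.shape[0])])
    return np.tanh(H + b)

def per_label_attention(H, U, beta, bias):
    """Build one document vector per label and score each label.

    H: (N, d_c) position features, U: (L, d_c) attention vectors,
    beta: (L, d_c) output weights, bias: (L,).
    Returns (probs, alpha) with probs: (L,) and alpha: (N, L).
    """
    scores = H @ U.T                    # (N, L): relevance of each position to each label
    alpha = softmax(scores, axis=0)     # normalize over positions, separately per label
    V = alpha.T @ H                     # (L, d_c): label-specific document vectors
    logits = np.sum(beta * V, axis=1) + bias
    probs = 1.0 / (1.0 + np.exp(-logits))  # independent sigmoid per label
    return probs, alpha

def max_pool(H):
    """Baseline: a single document vector shared by all labels."""
    return H.max(axis=0)

# Toy example with random weights (hypothetical sizes).
rng = np.random.default_rng(0)
N, d_e, d_c, k, L = 12, 8, 6, 3, 4
X = rng.normal(size=(N, d_e))
W = rng.normal(size=(k, d_e, d_c)) * 0.1
H = conv_encode(X, W, np.zeros(d_c))
probs, alpha = per_label_attention(
    H, rng.normal(size=(L, d_c)), rng.normal(size=(L, d_c)), np.zeros(L))
top_position = alpha.argmax(axis=0)  # most-attended token position for each label
```

Each column of `alpha` sums to 1 over token positions, so the highest-weighted positions for a given label can be read off as a candidate explanation for that label, whereas `max_pool(H)` offers no label-specific view of the document.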

Evaluation and Results

The evaluation uses the MIMIC-II and MIMIC-III datasets, which contain medical records from ICU stays. CAML outperforms the prior state of the art, achieving a precision@8 of 0.71 and a Micro-F1 of 0.54 on the full-label MIMIC-III dataset. Precision@8 scores the quality of the eight highest-ranked codes per document, while Micro-F1 pools decisions across all codes, so together they reflect how well the model handles a label space of thousands of codes.
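For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them for multi-label predictions. The function names, the 0.5 decision threshold, and the toy data are assumptions for illustration; the paper's exact evaluation code is not reproduced here.

```python
import numpy as np

def precision_at_k(y_true, y_score, k=8):
    """Mean fraction of the k highest-scored labels that are true labels.

    y_true: (n_docs, n_labels) binary matrix, y_score: (n_docs, n_labels) scores.
    """
    topk = np.argsort(-y_score, axis=1)[:, :k]        # indices of the top-k labels
    hits = np.take_along_axis(y_true, topk, axis=1)   # 1 where a top-k label is true
    return float(hits.sum(axis=1).mean() / k)

def micro_f1(y_true, y_pred):
    """F1 computed over all (document, label) decisions pooled together."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

# Toy data: 2 documents, 4 labels (hypothetical values).
y_true = np.array([[1, 0, 1, 0],
                   [0, 1, 0, 0]])
y_score = np.array([[0.9, 0.2, 0.4, 0.1],
                    [0.3, 0.8, 0.6, 0.1]])
y_pred = (y_score >= 0.5).astype(int)  # assumed threshold of 0.5

p_at_2 = precision_at_k(y_true, y_score, k=2)  # (2/2 + 1/2) / 2 = 0.75
f1 = micro_f1(y_true, y_pred)                  # 2*2 / (2*2 + 1 + 1) = 0.667
```

Because Micro-F1 pools all decisions, frequent codes dominate it; precision@k instead depends only on the ranking of scores within each document.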

For comparison, experiments were also run on the 50 most common codes and on the MIMIC-II dataset. In the 50-label setting, CAML again outperforms prior models. The interpretability of the attention mechanism was assessed through a qualitative evaluation by a physician, who judged the text snippets selected by attention to be meaningful explanations for the assigned codes.

Implications and Future Directions

The work has both practical and methodological implications. Predicting ICD codes with improved accuracy and interpretability holds promise for real-world clinical applications, potentially easing the coding burden on healthcare professionals. The attention mechanism not only improves prediction accuracy but also shows which parts of the text drove each code assignment, a critical property for decision support systems in healthcare.

Future directions include incorporating the hierarchical structure of ICD codes to further improve prediction accuracy, and extending the approach to multi-label text classification tasks outside the medical domain. The CAML architecture may be applicable to other domains with large or complex label taxonomies.

In conclusion, the paper advances the automatic prediction of medical codes from clinical text by adding a per-label attention mechanism to a CNN. The gains in both predictive performance and interpretability highlight the model's potential for practical deployment in medical coding and in other areas that require multi-label classification.
