Overview of Explainable Automated Coding of Clinical Notes
The paper proposes a novel approach to explainable automated coding of clinical notes based on a Hierarchical Label-wise Attention Network (HLAN) and label embedding initialisation, with the aim of improving the efficiency and accuracy of medical coding through automation. Automated coding promises to reduce the substantial manual workload of traditional diagnostic and procedural coding, but its adoption is impeded by two challenges the paper targets: the need for clear model interpretability and the inherent correlations among medical codes.
Methodology
The authors introduce the Hierarchical Label-wise Attention Network (HLAN) to address poor model interpretability. Unlike previous approaches in which attention weights are shared across labels, HLAN employs label-wise attention mechanisms at both the word and the sentence level. This design enables the model to quantify the importance of specific words and sentences with respect to each label, yielding more granular and comprehensible explanations. In addition, the authors propose a label embedding (LE) initialisation technique that incorporates correlations among labels, aiming to improve predictions by capturing the inter-label relationships encoded within clinical narratives.
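As a rough illustration of this design, the snippet below sketches a label-wise attention layer in PyTorch; the class, tensor shapes, and choice of a bi-GRU encoder are assumptions for exposition rather than the authors' implementation. In HLAN, the same mechanism is applied twice: over word-level encoder states within each sentence, and again over the resulting sentence representations, before per-label logits are computed.

```python
# Minimal sketch of a label-wise attention layer (illustrative assumption,
# not the authors' exact implementation).
import torch
import torch.nn as nn

class LabelWiseAttention(nn.Module):
    """One attention distribution, and hence one context vector, per label."""
    def __init__(self, hidden_dim: int, num_labels: int):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)
        # One learnable query vector per label; these can be initialised from
        # label embeddings (see the LE initialisation sketch further below).
        self.label_queries = nn.Parameter(torch.randn(num_labels, hidden_dim))

    def forward(self, states: torch.Tensor):
        # states: (batch, seq_len, hidden_dim) from a word- or sentence-level encoder
        keys = torch.tanh(self.proj(states))                      # (batch, seq_len, hidden)
        scores = torch.einsum("bsh,lh->bls", keys, self.label_queries)
        weights = torch.softmax(scores, dim=-1)                   # (batch, labels, seq_len)
        # The weights are the per-label importance scores surfaced as explanations;
        # the weighted sum yields one context vector per label.
        contexts = torch.einsum("bls,bsh->blh", weights, states)  # (batch, labels, hidden)
        return contexts, weights

# Usage sketch: attend over bi-GRU word states, producing one context per code.
encoder = nn.GRU(input_size=100, hidden_size=50, bidirectional=True, batch_first=True)
attention = LabelWiseAttention(hidden_dim=100, num_labels=50)
word_states, _ = encoder(torch.randn(8, 120, 100))   # batch of 8 sentences, 120 words each
contexts, weights = attention(word_states)           # contexts: (8, 50, 100)
```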
Three evaluation settings are used, all derived from the MIMIC-III dataset: the full set of ICD-9 codes, the 50 most frequent codes, and a set of UK NHS COVID-19 shielding codes. These settings allow the methods to be assessed across label sets of varying size and clinical focus. Model performance is compared against a range of state-of-the-art baselines, including Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) models.
Results
Experimental results show that HLAN achieves the best micro-level AUC and F1 scores in the top-50 code setting, at 91.9% and 64.1%, respectively. For the NHS COVID-19 shielding codes, it achieves a micro-level AUC of around 97%, comparable to the other models. Notably, HLAN provides more detailed and meaningful explanations by highlighting the words and sentences most relevant to each label, improving explainability over its CNN-based counterparts.
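For reference, micro-averaged metrics pool the per-label decisions across all labels and documents before computing the score. A minimal scikit-learn sketch follows; the random arrays stand in for gold labels and model outputs, and the 0.5 decision threshold is an assumption.

```python
# Illustrative computation of micro-averaged AUC and F1 for multi-label code
# prediction (random arrays stand in for real data; the 0.5 threshold is assumed).
import numpy as np
from sklearn.metrics import roc_auc_score, f1_score

y_true = np.random.randint(0, 2, size=(1000, 50))    # gold code assignments
y_score = np.random.rand(1000, 50)                    # per-label sigmoid outputs
y_pred = (y_score >= 0.5).astype(int)

micro_auc = roc_auc_score(y_true, y_score, average="micro")
micro_f1 = f1_score(y_true, y_pred, average="micro")
print(f"micro AUC: {micro_auc:.3f}  micro F1: {micro_f1:.3f}")
```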
Label embedding initialisation also provides a notable boost: applied to the previous state-of-the-art CNN with attention, it raises micro-level F1 on full-code prediction to 52.5%. These results underscore the value of encoding label correlations in the model's initial weights.
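A hedged sketch of how such an initialisation can be realised is given below: label vectors are learned from code co-occurrences in the training set and then copied into the per-label output weights, so that frequently co-occurring codes start from similar parameters. The use of gensim's word2vec, the example ICD-9 codes, and the layer being initialised are illustrative assumptions, not the paper's exact setup.

```python
# Hedged sketch of label embedding (LE) initialisation: label vectors learned
# from code co-occurrences are copied into per-label weights. The gensim call,
# example ICD-9 codes, and initialised layer are illustrative assumptions.
import numpy as np
import torch
from gensim.models import Word2Vec

# Each training document contributes the list of codes assigned to it.
code_sets = [["401.9", "428.0"], ["428.0", "584.9", "401.9"], ["038.9", "584.9"]]
w2v = Word2Vec(sentences=code_sets, vector_size=100, window=5, min_count=1, sg=1)

labels = ["401.9", "428.0", "584.9", "038.9"]
label_emb = np.stack([w2v.wv[c] for c in labels])    # (num_labels, vector_size)

num_labels, hidden = label_emb.shape
final_layer = torch.nn.Linear(hidden, num_labels)    # per-label output weights
with torch.no_grad():
    # Correlated codes start from similar weight vectors instead of random ones.
    final_layer.weight.copy_(torch.from_numpy(label_emb))
```

The same label vectors could also seed the per-label attention queries in the earlier sketch, so that the attention and output layers of correlated codes start close together.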
Implications and Future Directions
The paper's findings have significant implications for the practical deployment and utilization of AI-driven coding systems in healthcare settings. The enhanced interpretability provided by HLAN could bolster clinician trust and facilitate more informed decision-making processes, reducing the likelihood of erroneous code assignments. Moreover, the incorporation of label correlations through LE initialisation could lead to more accurate predictive performance, particularly in complex multi-label environments.
Future work may explore scaling HLAN to broader label sets, integrating external clinical ontologies, and refining training to better capture rare or emerging labels, all of which are essential for the robustness and adaptability of automated coding systems in dynamic clinical contexts. Such developments would further support the role of AI in healthcare and in managing large-scale biomedical information efficiently.