Deep Learning for Segmentation using an Open Large-Scale Dataset in 2D Echocardiography (1908.06948v2)

Published 16 Aug 2019 in eess.IV

Abstract: Delineation of the cardiac structures from 2D echocardiographic images is a common clinical task to establish a diagnosis. Over the past decades, the automation of this task has been the subject of intense research. In this paper, we evaluate how far the state-of-the-art encoder-decoder deep convolutional neural network methods can go at assessing 2D echocardiographic images, i.e., segmenting cardiac structures as well as estimating clinical indices, on a dataset especially designed to answer this objective. We therefore introduce the Cardiac Acquisitions for Multi-structure Ultrasound Segmentation (CAMUS) dataset, the largest publicly available and fully annotated dataset for the purpose of echocardiographic assessment. The dataset contains two- and four-chamber acquisitions from 500 patients with reference measurements from one cardiologist on the full dataset and from three cardiologists on a fold of 50 patients. Results show that encoder-decoder-based architectures outperform state-of-the-art non-deep learning methods and faithfully reproduce the expert analysis for the end-diastolic and end-systolic left ventricular volumes, with a mean correlation of 0.95 and an absolute mean error of 9.5 ml. Concerning the ejection fraction of the left ventricle, results are more contrasted with a mean correlation coefficient of 0.80 and an absolute mean error of 5.6 %. Although these results are below the inter-observer scores, they remain slightly worse than the intra-observer's ones. Based on this observation, areas for improvement are defined, which open the door for accurate and fully-automatic analysis of 2D echocardiographic images.

Authors (14)

Sarah Leclerc
Erik Smistad
João Pedrosa
Andreas Østvik
Frederic Cervenansky
Florian Espinosa
Torvald Espeland
Erik Andreas Rye Berg
Pierre-Marc Jodoin
Thomas Grenier
Carole Lartizien
Jan D'hooge
Lasse Lovstakken
Olivier Bernard

Citations (442)


Summary

The paper demonstrates that encoder-decoder CNNs, particularly U-Net variants, significantly outperform traditional segmentation methods with superior geometric and clinical accuracy.
It introduces the CAMUS dataset, the largest open collection of 2D echocardiographic images with expert annotations, to challenge model robustness across image quality variations.
The study emphasizes the potential of deep learning to automate cardiac image analysis, reducing clinician workload and enhancing diagnostic consistency.

Overview of Deep Learning for Segmentation Using an Open Large-Scale Dataset in 2D Echocardiography

The paper "Deep Learning for Segmentation Using an Open Large-Scale Dataset in 2D Echocardiography" evaluates how well state-of-the-art encoder-decoder convolutional neural networks (CNNs) segment cardiac structures in 2D echocardiography. The authors introduce the Cardiac Acquisitions for Multi-structure Ultrasound Segmentation (CAMUS) dataset, which the abstract describes as the largest publicly available, fully annotated dataset for echocardiographic assessment. The study addresses three questions: how CNNs compare with non-deep learning techniques, how much training data they require, and how accurately they estimate clinical indices such as left ventricular volumes.

The CAMUS Dataset

The CAMUS dataset contains two- and four-chamber acquisitions from 500 patients, i.e., 1000 acquisitions in total. It deliberately retains images of varying quality, which reflects the variability seen in clinical practice. Experts manually annotated the left ventricle endocardium, the epicardium, and the left atrium at end-diastole (ED) and end-systole (ES) to serve as ground-truth references. According to the abstract, one cardiologist annotated the full dataset and three cardiologists annotated a fold of 50 patients.
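The ED and ES instants matter because the left ventricular ejection fraction is derived from the volumes at those two instants. The following is a minimal sketch of the standard definition, EF = (EDV - ESV) / EDV x 100. The function name and the example volumes are illustrative assumptions; they are not taken from the paper or from CAMUS.

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Return the left-ventricular ejection fraction in percent.

    edv_ml: end-diastolic volume in millilitres.
    esv_ml: end-systolic volume in millilitres.
    """
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    # Fraction of the end-diastolic volume ejected during systole.
    return (edv_ml - esv_ml) / edv_ml * 100.0


if __name__ == "__main__":
    # Hypothetical volumes, for illustration only.
    print(f"EF = {ejection_fraction(120.0, 48.0):.1f} %")
```

Because EF is a ratio of two estimated volumes, errors in either segmentation propagate into it, which is one plausible reason it is harder to reproduce than the volumes themselves.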

Comparison with State-of-the-art Methods

The paper compares four CNN-based encoder-decoder architectures (U-Net, ACNN, SHG, and U-Net++) with two non-deep learning methods: Structured Random Forest (SRF) and the B-Spline Explicit Active Surface Model (BEASM). The encoder-decoder networks outperform both non-deep learning approaches on geometric metrics (e.g., Dice index, mean absolute distance) and on clinical metrics (e.g., volume estimation). For the end-diastolic and end-systolic left ventricular volumes, the abstract reports a mean correlation of 0.95 and a mean absolute error of 9.5 ml against the expert reference.
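As an illustration of the geometric metrics mentioned above, here is a minimal sketch of the Dice index, 2|A ∩ B| / (|A| + |B|), for two binary masks. It assumes masks are given as nested lists of 0/1 values, one structure at a time; for a multi-structure label map, each label would first be binarized. The function name and the toy masks are assumptions, not the authors' evaluation code.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]


def dice_index(pred: Mask, ref: Mask) -> float:
    """Dice overlap between two binary masks of identical shape."""
    if len(pred) != len(ref) or any(len(p) != len(r) for p, r in zip(pred, ref)):
        raise ValueError("masks must have the same shape")
    intersection = 0  # pixels set in both masks
    total = 0         # pixels set in pred plus pixels set in ref
    for row_p, row_r in zip(pred, ref):
        for p, r in zip(row_p, row_r):
            intersection += 1 if (p and r) else 0
            total += (1 if p else 0) + (1 if r else 0)
    # Convention chosen here: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    # Toy 2x3 masks, for illustration only.
    predicted = [[0, 1, 1], [0, 1, 0]]
    reference = [[0, 1, 0], [0, 1, 1]]
    print(f"Dice = {dice_index(predicted, reference):.3f}")
```

In the toy example each mask has three foreground pixels and they share two, so the Dice index is 4/6. Dice measures region overlap only; contour-based metrics such as mean absolute distance are complementary because they penalize boundary errors in physical units.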

Encoder-Decoder vs. Traditional Methods

Among the encoder-decoder networks, U-Net stands out for its efficiency: with a relatively low number of trainable parameters, it balances speed and accuracy. Its segmentation quality was comparable to that of more complex architectures such as ACNN and SHG, which suggests that added architectural sophistication yields diminishing returns for 2D echocardiographic segmentation.

Insights from Variability Analysis

The inter- and intra-observer analysis shows how difficult echocardiographic segmentation is even for experts: Dice scores between different observers varied considerably, so consistent manual annotation is itself hard to achieve. The encoder-decoder models produced errors within the inter-observer variability. They did not reach intra-observer variability, however, which leaves room for improvement; this is most visible for the left ventricular ejection fraction, where the abstract reports a mean correlation of 0.80 and a mean absolute error of 5.6 %.
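Agreement between two sets of measurements, whether model versus expert or expert versus expert, is summarized in the abstract by a correlation and a mean absolute error. The sketch below computes both for paired values. It assumes the Pearson correlation coefficient, which is an assumption about the paper's exact statistic, and the function names and sample values are hypothetical.

```python
import math
from typing import Sequence


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(x: Sequence[float], y: Sequence[float]) -> float:
    """Mean of |x_i - y_i| over paired measurements."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty samples of equal length")
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


if __name__ == "__main__":
    # Hypothetical end-diastolic volumes in ml, for illustration only.
    expert = [95.0, 120.0, 143.0, 88.0, 110.0]
    model = [101.0, 112.0, 150.0, 92.0, 104.0]
    print(f"r   = {pearson_correlation(expert, model):.3f}")
    print(f"MAE = {mean_absolute_error(expert, model):.1f} ml")
```

The two numbers answer different questions: correlation captures whether the measurements rank patients consistently, while mean absolute error captures how far apart they are in clinical units, so a method can score well on one and poorly on the other.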

Practical and Theoretical Implications

The paper highlights the potential of deep learning to automate cardiac image analysis, which could substantially reduce the time burden on clinicians while improving diagnostic consistency. The CAMUS dataset also sets a precedent for releasing well-curated, large-scale datasets, which are essential for advancing machine learning in healthcare. Finally, techniques that exploit temporal coherence, such as recurrent neural networks, may improve the estimation of clinical indices like ejection fraction.

Future Directions

Advancements in neural network architectural design, training strategies, and deployment in diverse clinical settings present avenues for further research. Developing methods that incorporate temporal continuity could improve analysis accuracy during dynamic cardiac phases like systole and diastole. Additionally, expanding datasets to incorporate multi-vendor and multi-center data could increase generalization and robustness in clinical applications.

This work matters both for immediate clinical application and for machine learning research in healthcare more broadly. Further work along these lines could make echocardiographic diagnosis faster and more consistent.
