Self-Supervised Learning for Cardiac MR Image Segmentation by Anatomical Position Prediction (1907.02757v1)

Published 5 Jul 2019 in cs.CV

Abstract: In the recent years, convolutional neural networks have transformed the field of medical image analysis due to their capacity to learn discriminative image features for a variety of classification and regression tasks. However, successfully learning these features requires a large amount of manually annotated data, which is expensive to acquire and limited by the available resources of expert image analysts. Therefore, unsupervised, weakly-supervised and self-supervised feature learning techniques receive a lot of attention, which aim to utilise the vast amount of available data, while at the same time avoid or substantially reduce the effort of manual annotation. In this paper, we propose a novel way for training a cardiac MR image segmentation network, in which features are learnt in a self-supervised manner by predicting anatomical positions. The anatomical positions serve as a supervisory signal and do not require extra manual annotation. We demonstrate that this seemingly simple task provides a strong signal for feature learning and with self-supervised learning, we achieve a high segmentation accuracy that is better than or comparable to a U-net trained from scratch, especially at a small data setting. When only five annotated subjects are available, the proposed method improves the mean Dice metric from 0.811 to 0.852 for short-axis image segmentation, compared to the baseline U-net.

Authors (9)

Wenjia Bai (80 papers)
Chen Chen (753 papers)
Giacomo Tarroni (27 papers)
Jinming Duan (48 papers)
Florian Guitton (5 papers)
Steffen E. Petersen (22 papers)
Yike Guo (144 papers)
Paul M. Matthews (14 papers)
Daniel Rueckert (335 papers)

Citations (168)

Summary

Self-Supervised Learning for Cardiac MR Image Segmentation

The paper introduces a self-supervised learning (SSL) method for cardiac MR image segmentation that uses anatomical position prediction as the pretext task. It is part of a broader effort to reduce the dependence on large manually annotated datasets in medical imaging by exploiting unannotated data. The motivation is practical: manual annotation is expensive and limited by the availability of expert image analysts.

Methodology and Contributions

The authors propose a cardiac MR image segmentation network trained via a self-supervised approach, wherein features are learned by predicting anatomical positions automatically defined by cardiac chamber view planes. Two primary contributions are outlined:

Anatomical Position Prediction: The pretext task is to predict anatomical positions, which are defined by the standard view planes of a cardiac MR scan and therefore require no extra manual annotation. Because the task relies only on spatial information already present in unannotated MR images, it can in principle be applied to the large volume of unannotated clinical data stored in hospital Picture Archiving and Communication Systems (PACS).
Segmentation Performance: The paper benchmarks segmentation performance against a baseline U-net trained from scratch. For short-axis and long-axis image segmentation, the self-supervised approach achieves accuracy that is better than or comparable to the baseline, with the largest gains when annotated data are limited. For instance, with only five annotated subjects, the mean Dice metric for short-axis image segmentation improved from 0.811 to 0.852, which suggests that the pretext task yields features that transfer well to segmentation.
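
To make the idea of a position-based supervisory signal concrete, the following is a minimal sketch of how a pixel-wise label map could be built from two view-plane directions. Everything specific here is an assumption for illustration and is not taken from the paper: the function name `position_label_map`, the use of square boxes, the box size and spacing, and the number of positions. It assumes only that the two view planes cross the image as two lines through a common point.

```python
import numpy as np

def position_label_map(shape, center, directions, spacing=20, half_size=5):
    """Build a label map of hypothetical anatomical positions.

    Label 0 is background, label 1 is the box at `center` (row, col),
    and further labels mark boxes offset by +/- `spacing` pixels along
    each direction. All parameters are illustrative assumptions.
    """
    labels = np.zeros(shape, dtype=np.int64)
    origin = np.asarray(center, dtype=float)
    centers = [origin]
    for d in directions:
        d = np.asarray(d, dtype=float)
        d = d / np.linalg.norm(d)          # unit vector along the line
        centers.append(origin + spacing * d)
        centers.append(origin - spacing * d)
    for k, c in enumerate(centers, start=1):
        r, col = np.round(c).astype(int)
        # Clip box bounds so boxes near the border stay inside the image.
        r0 = np.clip(r - half_size, 0, shape[0])
        r1 = np.clip(r + half_size + 1, 0, shape[0])
        c0 = np.clip(col - half_size, 0, shape[1])
        c1 = np.clip(col + half_size + 1, 0, shape[1])
        labels[r0:r1, c0:c1] = k
    return labels

# Two perpendicular lines through the image centre give five positions.
labels = position_label_map((64, 64), center=(32, 32),
                            directions=[(1, 0), (0, 1)])
```

A label map of this kind can be generated for every unannotated scan from its acquisition geometry alone, which is what makes the task self-supervised.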

Self-supervised pretraining uses a U-net architecture optimized with a cross-entropy loss. Transfer learning then adapts the pretrained network to anatomical structure segmentation, through strategies such as SSL+Decoder and SSL+MultiTask, with the latter performing best.
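
As a rough illustration of the loss involved, the sketch below computes a pixel-wise softmax cross-entropy in NumPy and combines two such terms into a joint objective. It assumes that a multi-task variant sums a segmentation loss and a position-prediction loss; the function names and the `pretext_weight` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def pixelwise_cross_entropy(logits, labels):
    """Mean softmax cross-entropy over all pixels.

    logits: float array of shape (H, W, C)
    labels: integer array of shape (H, W) with values in [0, C)
    """
    # Subtract the per-pixel maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick the log-probability of the true class at each pixel.
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)
    return float(-picked.mean())

def multitask_loss(seg_logits, seg_labels, pos_logits, pos_labels,
                   pretext_weight=1.0):
    """Assumed joint objective: segmentation loss plus weighted pretext loss."""
    seg = pixelwise_cross_entropy(seg_logits, seg_labels)
    pos = pixelwise_cross_entropy(pos_logits, pos_labels)
    return seg + pretext_weight * pos
```

With uniform (all-zero) logits over C classes, `pixelwise_cross_entropy` returns log C, which is a convenient sanity check before training.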

Experimental Results

The network is pretrained on unannotated data from as many as 3,825 subjects and then adapted to segmentation using annotated training sets of varying size, down to very small ones. With as few as one labeled subject, the self-supervised approaches still segmented key cardiac structures more accurately than a randomly initialized U-net. This matters for tasks where acquiring labeled data is prohibitively expensive or infeasible.

Performance was measured with the Dice metric and the mean contour distance error, and both showed gains across data availability settings, most clearly when annotated data were scarce. These experiments indicate that SSL can exploit the spatial context already present in cardiac MR images without any prior manual annotation.
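
For reference, the Dice metric for one structure is twice the overlap between the predicted and reference regions divided by the sum of their sizes. The sketch below is a minimal NumPy version; the function name `dice` and the choice to return NaN when a label is absent from both maps are illustrative assumptions, not the paper's evaluation code.

```python
import numpy as np

def dice(pred, truth, label):
    """Dice overlap for one label between two integer label maps."""
    p = (pred == label)
    t = (truth == label)
    denom = p.sum() + t.sum()
    if denom == 0:
        # Label absent from both maps: overlap is undefined.
        return float("nan")
    return float(2.0 * np.logical_and(p, t).sum() / denom)

pred = np.array([[1, 1], [0, 0]])
truth = np.array([[1, 0], [0, 0]])
score = dice(pred, truth, label=1)  # 2 * 1 / (2 + 1)
```

A mean Dice such as the 0.811 and 0.852 figures quoted above would then be an average of such per-structure scores, although the exact averaging used in the paper is not specified here.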

Implications and Future Work

The results support self-supervised learning as a way to counteract the scarcity of annotated medical data. Because the pretext task relies only on standard imaging protocols that are already routine, the approach has potential for deployment in clinical settings. The gains in segmentation accuracy with minimal annotated data also suggest applications in domains where data privacy and sensitivity prevent large-scale labeling.

Future research could explore a wider range of anatomical pretext tasks to capture more anatomical and functional features and thereby improve generalization. The framework could also motivate multimodal SSL approaches that integrate auxiliary data sources to improve segmentation accuracy.

In conclusion, the paper shows that self-supervised pretraining by anatomical position prediction can improve cardiac MR segmentation, particularly when annotation capacity is limited, and it offers a template for feature learning in other medical imaging tasks that face the same constraint.