Federated Semi-Supervised Learning for COVID Region Segmentation in Chest CT using Multi-National Data from China, Italy, Japan (2011.11750v1)

Published 23 Nov 2020 in eess.IV and cs.CV

Abstract: The recent outbreak of COVID-19 has led to urgent needs for reliable diagnosis and management of SARS-CoV-2 infection. As a complementary tool, chest CT has been shown to be able to reveal visual patterns characteristic of COVID-19, which has definite value at several stages during the disease course. To facilitate CT analysis, recent efforts have focused on computer-aided characterization and diagnosis, which has shown promising results. However, domain shift of data across clinical data centers poses a serious challenge when deploying learning-based models. In this work, we attempt to find a solution for this challenge via federated and semi-supervised learning. A multi-national database consisting of 1704 scans from three countries is adopted to study the performance gap, when training a model with one dataset and applying it to another. Expert radiologists manually delineated 945 scans for COVID-19 findings. In handling the variability in both the data and annotations, a novel federated semi-supervised learning technique is proposed to fully utilize all available data (with or without annotations). Federated learning avoids the need for sensitive data-sharing, which makes it favorable for institutions and nations with strict regulatory policy on data privacy. Moreover, semi-supervision potentially reduces the annotation burden under a distributed setting. The proposed framework is shown to be effective compared to fully supervised scenarios with conventional data sharing instead of model weight sharing.

Authors (20)

Dong Yang
Ziyue Xu
Wenqi Li
Andriy Myronenko
Holger R. Roth
Stephanie Harmon
Sheng Xu
Baris Turkbey
Evrim Turkbey
Xiaosong Wang
Wentao Zhu
Gianpaolo Carrafiello
Francesca Patella
Maurizio Cariati
Hirofumi Obinata
Hitoshi Mori
Kaku Tamura
Peng An
Bradford J. Wood
Daguang Xu

Citations (207)


Summary

The paper introduces a federated learning framework that integrates semi-supervised methods to effectively combine labeled and unlabeled data for COVID region segmentation.
It employs a 3D U-shape Fully Convolutional Network on multinational CT scans from China, Italy, and Japan to address domain shifts and privacy concerns.
Results demonstrate improved segmentation accuracy and model generalizability across diverse datasets, offering a scalable solution for healthcare diagnostics.

Federated Semi-Supervised Learning for COVID Region Segmentation in Chest CT Scans

The increasing global demand for reliable diagnostic tools for COVID-19 has amplified the significance of medical imaging, particularly chest CT scans, in identifying key disease characteristics. Despite advancements, deploying deep learning models across different clinical settings remains challenging due to domain shifts, data privacy concerns, and annotation variability. This paper addresses these challenges by proposing an innovative approach combining federated and semi-supervised learning to enhance COVID-19 region segmentation in chest CT scans using multinational datasets.

The dataset comprises 1704 scans from three countries: China, Italy, and Japan. Expert radiologists manually delineated COVID-19 findings on 945 of these scans, which leaves 759 scans without annotations. The paper highlights the disparities in data acquisition protocols and patient demographics across these geographic locations.

Key Contributions

Federated Learning Framework: The paper introduces a federated learning (FL) system that allows institutions to collaboratively train deep learning models without sharing sensitive data, thereby aligning with stringent data privacy regulations. This decentralized approach permits model weight sharing while maintaining data confidentiality.
Semi-Supervised Learning Integration: The integration of semi-supervised learning within the FL framework enables effective utilization of both labeled and unlabeled data. This is particularly beneficial in scenarios where acquiring annotated data is challenging due to variations in resources and expertise.
Robustness to Domain Shifts: The novel approach is tested for its resilience to domain shifts. The FL framework significantly improves generalizability by leveraging diverse datasets without compromising data privacy. Notably, the paper demonstrates that the FL model trained using information from multiple datasets exhibits superior performance compared to models trained on individual datasets.
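The weight sharing described in the contributions above can be illustrated with a minimal sketch. It assumes a FedAvg-style aggregation in which a server averages client weights in proportion to each client's number of local scans. The summary only states that model weights are shared, so the aggregation rule, the function name federated_average, the parameter name "conv.weight", and the scan counts are all hypothetical.

```python
from typing import Dict, List

# Parameter name -> flat list of values (a stand-in for real tensors).
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client weights, weighting each client by its local sample count.

    Assumption: FedAvg-style weighted averaging; the summarized paper may
    aggregate differently.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need at least one client and one size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Three hypothetical sites, each holding a tiny two-parameter "model".
site_weights = [
    {"conv.weight": [1.0, 2.0]},
    {"conv.weight": [3.0, 4.0]},
    {"conv.weight": [5.0, 6.0]},
]
site_sizes = [100, 100, 200]  # hypothetical scan counts, not taken from the paper

global_weights = federated_average(site_weights, site_sizes)
```

Only the weight dictionaries leave each site in this sketch; the scans themselves never do, which is the property the federated setting relies on.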

Methodology

The researchers employ a 3D U-shape Fully Convolutional Network (FCN) as the base architecture for COVID-19 affected region segmentation. Federated learning is leveraged to synchronize model weights across international sites, thus avoiding the need for sensitive data sharing. Furthermore, the semi-supervised learning aspect ensures that unlabeled data from different clients are effectively integrated into the model, bridging data gaps and enhancing performance.
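One common way to use unlabeled scans on a client is confidence-thresholded pseudo-labeling, sketched below. The summary does not say which semi-supervised technique the paper uses, so treat this as a generic illustration: the function names, the 0.9 threshold, and the toy probabilities are assumptions, and voxels are represented as a flat list of foreground probabilities.

```python
import math
from typing import List, Tuple


def pseudo_label(probs: List[float], threshold: float = 0.9) -> Tuple[List[int], List[bool]]:
    """Turn predicted foreground probabilities into hard labels plus a confidence mask.

    Assumption: a voxel counts as confident when the probability of its
    predicted class is at least `threshold`.
    """
    labels = [1 if p >= 0.5 else 0 for p in probs]
    confident = [max(p, 1.0 - p) >= threshold for p in probs]
    return labels, confident


def masked_bce(probs: List[float], labels: List[int], mask: List[bool], eps: float = 1e-7) -> float:
    """Binary cross-entropy averaged over the voxels selected by `mask`."""
    terms = []
    for p, y, keep in zip(probs, labels, mask):
        if not keep:
            continue
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        terms.append(-(y * math.log(p) + (1 - y) * math.log(1.0 - p)))
    return sum(terms) / len(terms) if terms else 0.0


# Hypothetical predictions on four voxels of an unlabeled scan.
reference_probs = [0.97, 0.55, 0.02, 0.40]  # predictions used to create pseudo-labels
current_probs = [0.80, 0.50, 0.10, 0.50]    # predictions being trained

labels, confident = pseudo_label(reference_probs)
unsupervised_loss = masked_bce(current_probs, labels, confident)
```

In this toy case only the first and third voxels pass the threshold, so the uncertain voxels contribute nothing to the loss. A client holding annotated scans would add its ordinary supervised loss to this term.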

Experimental Observations

The paper reports experiments across several configuration settings. Federated learning with semi-supervised components compares favorably with fully supervised training under conventional data sharing, particularly when large, diverse datasets are involved. Model generalizability improves across the different geographic cohorts. The framework is also described as flexible with respect to base architecture and loss function, which broadens its applicability to other medical imaging tasks.
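Cross-site generalizability of a segmentation model is typically measured with an overlap metric. The sketch below assumes the Dice coefficient on binary masks, a standard choice for segmentation; the summary does not name the metric used, and the masks here are made-up flat lists rather than real CT volumes.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient between two binary masks given as flat 0/1 sequences.

    Convention (an assumption): two empty masks count as a perfect match.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    if denominator == 0:
        return 1.0
    return 2.0 * intersection / denominator


# Hypothetical prediction and annotation for a five-voxel region.
predicted_mask = [1, 1, 0, 0, 1]
annotated_mask = [1, 0, 0, 1, 1]
score = dice_score(predicted_mask, annotated_mask)
```

To probe domain shift, the same function would be applied to a model trained on one site's scans and evaluated on another site's annotated scans, then compared against the federated model on the same test set.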

Implications and Future Directions

The proposed federated semi-supervised learning framework has significant implications for cross-institutional collaboration in medical image analysis under stringent privacy constraints. It presents a scalable solution for enhancing diagnostic capabilities on a global scale, particularly in the context of pandemics like COVID-19 where rapid deployment of robust diagnostic tools is critical.

The authors suggest that future developments could focus on further refining federated learning techniques to address challenges associated with non-iid data distributions and partial weight updates. Moreover, extending this framework to other medical imaging tasks could catalyze broader advancements in automated disease detection and characterization.

In conclusion, this paper provides compelling evidence for the efficacy of federated, semi-supervised learning in handling domain shifts and minimizing annotation dependencies while complying with privacy regulations, potentially paving the way for innovative AI applications in healthcare.
