MosMedData: Chest CT Scans With COVID-19 Related Findings Dataset (2005.06465v1)

Published 13 May 2020 in cs.CY, cs.LG, and eess.IV

Abstract: This dataset contains anonymised human lung computed tomography (CT) scans with COVID-19 related findings, as well as without such findings. A small subset of studies has been annotated with binary pixel masks depicting regions of interests (ground-glass opacifications and consolidations). CT scans were obtained between 1st of March, 2020 and 25th of April, 2020, and provided by municipal hospitals in Moscow, Russia. Permanent link: https://mosmed.ai/datasets/covid19_1110. This dataset is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0) License. Key words: artificial intelligence, COVID-19, machine learning, dataset, CT, chest, imaging

Authors (10)

S. P. Morozov (3 papers)
A. E. Andreychenko (2 papers)
N. A. Pavlov (1 paper)
A. V. Vladzymyrskyy (2 papers)
N. V. Ledikhova (1 paper)
V. A. Gombolevskiy (1 paper)
I. A. Blokhin (1 paper)
P. B. Gelezhe (1 paper)
A. V. Gonchar (1 paper)
V. Yu. Chernina (1 paper)

Citations (235)

Summary

The paper presents a collection of 1110 chest CT scans categorized into five severity levels (CT-0 through CT-4).
It includes binary pixel masks for a subset of scans, enabling precise segmentation of COVID-19 related abnormalities.
The dataset supports the rapid development and validation of AI systems aimed at efficient and accurate COVID-19 diagnosis.

Overview of MosMedData: Chest CT Scans with COVID-19 Related Findings Dataset

The paper "MosMedData: Chest CT Scans with COVID-19 Related Findings Dataset" presents a collection of anonymized human lung computed tomography (CT) scans, released in response to urgent needs that arose during the COVID-19 pandemic. Developed by the Research and Practical Clinical Center of Diagnostics and Telemedicine Technologies, Department of Health Care of Moscow, the dataset serves as a resource for developing and validating AI algorithms in medical imaging, specifically for detecting COVID-19 related abnormalities.

Dataset Composition and Annotation

The dataset comprises 1110 CT scans collected between March 1, 2020, and April 25, 2020, from municipal hospitals in Moscow, Russia. The scans are categorized into five severity levels, CT-0 through CT-4, based on the extent of lung tissue abnormalities and additional clinical criteria, and include studies both with and without COVID-19 related findings. The key findings are ground-glass opacities and pulmonary consolidations, which are well-documented indicators of COVID-19 pathology. For a subset of 50 scans, the dataset also provides binary pixel masks outlining the regions of interest (ground-glass opacifications and consolidations), which supports the training and validation of segmentation models.
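Binary pixel masks of this kind are typically used to score how well a predicted segmentation overlaps the reference annotation. The sketch below uses the Dice coefficient, a common overlap metric that the paper itself does not prescribe. The function name `dice_coefficient` and the toy masks are hypothetical, and the masks are assumed to be already flattened to sequences of 0/1 values.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], ref: Sequence[int]) -> float:
    """Dice overlap between two flattened binary masks (1 = finding, 0 = background)."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as a finding in both masks.
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    # Total number of finding pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Convention (an assumption here): two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: a hypothetical 6-pixel reference annotation and model prediction.
reference = [0, 1, 1, 0, 1, 0]
predicted = [0, 1, 0, 0, 1, 1]
score = dice_coefficient(predicted, reference)
print(f"Dice = {score:.3f}")
```

In practice the masks would be loaded from the dataset's annotation files as 3D arrays and flattened before scoring. The loading step is omitted here because the file layout is not described in this summary.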

Technical Significance and AI Integration

Computed tomography has emerged as a pivotal diagnostic tool during the COVID-19 pandemic, particularly in managing outpatient and inpatient cases. However, the increased reliance on CT imaging has added to the workload of healthcare professionals, motivating the use of AI to automate parts of the diagnostic process. The MosMedData dataset supports AI algorithms that aim to triage patients efficiently, prioritize studies with COVID-19 related findings, and perform high-quality assessments of pathological changes. Preliminary analyses suggest that AI holds promise for improving diagnostic evaluation, as measured by sensitivity, specificity, and overall accuracy in identifying COVID-19 related findings.
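Performance on a binary triage task of this kind is usually summarized by sensitivity, specificity, and accuracy. The following is a minimal sketch of those three measures, assuming that CT-0 denotes a study without findings and CT-1 through CT-4 denote studies with findings. That mapping, the helper names, and the toy labels are illustrative assumptions, not results or code from the paper.

```python
from typing import Dict, Sequence


def binarize(category: str) -> int:
    """Map a severity label to a binary target.

    Assumption: "CT-0" means no findings (0); "CT-1".."CT-4" mean findings (1).
    """
    return 0 if category == "CT-0" else 1


def screening_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy for binary labels (1 = findings)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        # Share of studies with findings that the model flags.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of studies without findings that the model clears.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Share of all studies classified correctly.
        "accuracy": (tp + tn) / len(y_true) if y_true else float("nan"),
    }


# Toy example with hypothetical reference categories and model outputs.
categories = ["CT-0", "CT-1", "CT-2", "CT-0", "CT-3", "CT-4"]
y_true = [binarize(c) for c in categories]
y_pred = [0, 1, 0, 1, 1, 1]
metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

Collapsing the five categories into two classes is only one possible evaluation design. A multi-class evaluation over CT-0 through CT-4 would need per-class metrics instead.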

Implications and Future Perspectives

The release of datasets like MosMedData opens several directions for future research in AI-driven medical diagnostics. The dataset not only provides a basis for developing and calibrating algorithms but also encourages their independent verification across varied datasets. Because the COVID-19 pandemic has underscored the demand for rapid and accurate diagnostic tools, such datasets are valuable for streamlining clinical workflows and reducing human error in radiological assessments.

Given the persistence of SARS-CoV-2 in various forms and the probable emergence of other respiratory pathogens, the creation of large, annotated datasets remains essential for advancing AI methodologies in medical imaging. Future developments may involve improving the ability of AI models to generalize across different populations and imaging modalities, thereby strengthening the healthcare system's capacity to address pandemics efficiently. The MosMedData dataset is a substantive addition to the publicly available resources (it is distributed under a CC BY-NC-ND 3.0 license rather than placed in the public domain) and will likely inspire further innovations in computer-aided diagnostic systems.
