Insights into Abdominal Organ Segmentation: A Detailed Analysis of the AbdomenCT-1K Dataset
The field of medical imaging has been significantly transformed by deep learning, with organ segmentation from CT scans being a pivotal application. The paper "AbdomenCT-1K: Is Abdominal Organ Segmentation A Solved Problem?" takes a critical look at the status of current segmentation methods, introducing a large, diverse dataset named AbdomenCT-1K and examining how mainstream models behave when evaluated on data unlike what they were trained on. The paper contributes substantial insights into the generalizability and robustness of segmentation algorithms and prompts a reassessment of what should be considered 'solved' in this domain.
Dataset Construction and Annotation
The AbdomenCT-1K dataset is the cornerstone of the investigation, comprising over 1000 CT scans from 12 medical centers. The dataset includes multi-vendor, multi-phase, and multi-disease cases, offering a far more varied and challenging setting for evaluating organ segmentation algorithms. Annotation was carried out by junior annotators under the supervision of experienced radiologists, ensuring accuracy and consistency across the dataset. Such meticulous annotation is essential for a reliable benchmark against which segmentation methods can be assessed.
Evaluation of State-of-the-Art Methods
The dataset was used to scrutinize the performance of state-of-the-art methods such as nnU-Net. The paper shows that while high Dice Similarity Coefficient (DSC) scores can be achieved when training and testing on the same dataset and acquisition conditions, performance drops markedly when models are evaluated on scans from different centers or under different conditions. This gap exposes a limitation of current methodologies: variations in scanner vendor, contrast phase, or patient condition degrade a model's performance, challenging the notion that abdominal organ segmentation is a fully solved problem.
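To make the reported numbers concrete, the DSC between a predicted mask and a reference mask is twice the overlapping volume divided by the sum of the two volumes. Below is a minimal sketch of how such a score might be computed on binary masks; the array names and toy shapes are illustrative, not taken from the paper's evaluation code.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice Similarity Coefficient between two binary masks.

    DSC = 2 * |pred AND gt| / (|pred| + |gt|).
    Returns 1.0 when both masks are empty (a common convention).
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, gt).sum() / denom

# Toy example: two partially overlapping cubes in a small 3D volume
pred = np.zeros((8, 8, 8), dtype=bool)
gt = np.zeros((8, 8, 8), dtype=bool)
pred[2:6, 2:6, 2:6] = True
gt[3:7, 3:7, 3:7] = True
print(f"DSC = {dice_coefficient(pred, gt):.3f}")  # ~0.42
```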
Benchmark Development
The researchers went beyond identifying the problem by establishing new segmentation benchmarks for four challenging tasks: fully supervised, semi-supervised, weakly supervised, and continual learning. These tasks reflect active research directions aimed at improving learning efficiency and generalization. The benchmarks provide a comprehensive platform for evaluating models on more realistic, diverse, and clinically relevant tasks. Moreover, they adopt not only the DSC but also the Normalized Surface Dice (NSD) as an evaluation metric, acknowledging the importance of accurate boundary delineation in clinical applications such as surgical planning.
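The NSD complements the DSC by measuring how much of each segmentation's boundary lies within a given tolerance of the other's boundary, which is what matters for tasks like surgical planning. The sketch below is a simplified approximation using SciPy distance transforms; it assumes isotropic voxel spacing and an illustrative tolerance, and it is not the paper's exact evaluation code.

```python
import numpy as np
from scipy import ndimage

def normalized_surface_dice(pred: np.ndarray, gt: np.ndarray, tol_vox: float = 1.0) -> float:
    """Simplified Normalized Surface Dice (NSD) between two binary masks.

    Fraction of surface voxels of each mask lying within `tol_vox` voxels
    of the other mask's surface (isotropic spacing assumed).
    """
    pred, gt = pred.astype(bool), gt.astype(bool)
    # Surface = mask minus its morphological erosion
    pred_surf = pred ^ ndimage.binary_erosion(pred)
    gt_surf = gt ^ ndimage.binary_erosion(gt)
    if pred_surf.sum() == 0 or gt_surf.sum() == 0:
        # One mask is empty: NSD is 1 if both are empty, else 0
        return float(pred_surf.sum() == gt_surf.sum())
    # Distance from every voxel to the nearest surface voxel of the other mask
    dist_to_gt = ndimage.distance_transform_edt(~gt_surf)
    dist_to_pred = ndimage.distance_transform_edt(~pred_surf)
    close = (dist_to_gt[pred_surf] <= tol_vox).sum() + (dist_to_pred[gt_surf] <= tol_vox).sum()
    return close / (pred_surf.sum() + gt_surf.sum())
```

In practice the tolerance is usually specified in millimetres and converted with the scan's voxel spacing; the isotropic assumption here is purely for brevity.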
Baseline Solutions and Future Directions
For each benchmark, baseline solutions built on state-of-the-art methods were developed. In particular, nnU-Net-based baselines were tailored to the semi-supervised and weakly supervised tasks, demonstrating that unannotated or sparsely labeled data can be exploited effectively. Although substantial progress was observed, particularly in the fully supervised setting, persistent challenges highlight the need for further research.
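One common way to exploit unannotated scans, in the spirit of the self-training baselines described above, is iterative pseudo-labeling: train on the labeled subset, predict labels for the unlabeled scans, and retrain on the union. The sketch below is framework-agnostic; `train_segmentation_model` and `predict_masks` are hypothetical stand-ins for whatever trainer (for example, nnU-Net) is actually used, not functions from the paper or any specific library.

```python
from typing import Callable, List, Tuple
import numpy as np

def self_training(
    labeled: List[Tuple[np.ndarray, np.ndarray]],  # (CT volume, annotated mask) pairs
    unlabeled: List[np.ndarray],                    # CT volumes without annotations
    train_segmentation_model: Callable,             # hypothetical: data -> fitted model
    predict_masks: Callable,                        # hypothetical: (model, volumes) -> masks
    rounds: int = 2,
):
    """Minimal pseudo-labeling loop (a sketch, not the paper's exact pipeline)."""
    # Initial model trained only on manually annotated scans
    model = train_segmentation_model(labeled)
    for _ in range(rounds):
        # Generate pseudo-labels for the unannotated scans with the current model
        pseudo = list(zip(unlabeled, predict_masks(model, unlabeled)))
        # Retrain on manual labels plus pseudo-labels
        model = train_segmentation_model(labeled + pseudo)
    return model
```

Real pipelines typically add safeguards this sketch omits, such as filtering out low-confidence pseudo-labels before retraining.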
This work implies that research in abdominal organ segmentation must go beyond improving segmentation algorithms in isolation. Models must be tested across diverse datasets, with attention to the kinds of variation encountered in real-world practice, before they can be considered robust. The introduction of AbdomenCT-1K and the accompanying benchmarks gives the community a new avenue for exploring the unsolved aspects of organ segmentation, potentially steering future work toward generalizable and clinically applicable solutions.
The paper’s findings stress that continued innovation is needed to design adaptable models that can handle the variation and noise inherent in clinical data. Closer collaboration between machine learning researchers and clinical practitioners, integrating domain-specific knowledge into model development, will help narrow the gap toward more reliable medical image interpretation systems.