
CHAOS Challenge -- Combined (CT-MR) Healthy Abdominal Organ Segmentation (2001.06535v3)

Published 17 Jan 2020 in eess.IV and cs.CV

Abstract: Segmentation of abdominal organs has been a comprehensive, yet unresolved, research field for many years. In the last decade, intensive developments in deep learning (DL) have introduced new state-of-the-art segmentation systems. In order to expand the knowledge on these topics, the CHAOS - Combined (CT-MR) Healthy Abdominal Organ Segmentation challenge was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI), 2019, in Venice, Italy. CHAOS provides both abdominal CT and MR data from healthy subjects for single and multiple abdominal organ segmentation. Five different but complementary tasks have been designed to analyze the capabilities of current approaches from multiple perspectives. The results are investigated thoroughly and compared with manual annotations and interactive methods. The analysis shows that DL models for a single modality (CT / MR) can achieve reliable volumetric analysis performance (DICE: 0.98 $\pm$ 0.00 / 0.95 $\pm$ 0.01), but the best MSSD performance remains limited (21.89 $\pm$ 13.94 / 20.85 $\pm$ 10.63 mm). The performance of participating models decreases significantly for cross-modality tasks, both for the liver (DICE: 0.88 $\pm$ 0.15 MSSD: 36.33 $\pm$ 21.97 mm) and for all organs (DICE: 0.85 $\pm$ 0.21 MSSD: 33.17 $\pm$ 38.93 mm). Despite contrary examples in other applications, multi-tasking DL models designed to segment all organs seem to perform worse than organ-specific ones (performance drop around 5\%). Further research on cross-modality segmentation would significantly support real-world clinical applications. Moreover, with more than 1500 participants, another important contribution of the paper is its analysis of shortcomings in challenge organization, such as the effects of multiple submissions and the peeking phenomenon.

Overview of the CHAOS Challenge in Abdominal Organ Segmentation

The CHAOS challenge represents a significant contribution to the field of medical image analysis by addressing the complex task of abdominal organ segmentation across two imaging modalities, CT and MR. The challenge was organized to evaluate current segmentation methodologies and to provide a benchmark dataset for the wider research community. The paper discusses the tasks formulated to test participating deep learning (DL) models in both single-modality and cross-modality contexts, and elaborates on their performance and the implications for future developments in medical image segmentation.

Methodology

The challenge introduced five tasks designed to assess DL models' segmentation performance on both CT and MRI data. The tasks cover single-organ segmentation (liver) and multi-organ segmentation (liver, spleen, kidneys) on each modality, as well as the more demanding cross-modality setting, in which models must handle both CT and MRI data.

Each model's effectiveness was measured using four metrics: DICE coefficient, Relative Absolute Volume Difference (RAVD), Average Symmetric Surface Distance (ASSD), and Maximum Symmetric Surface Distance (MSSD). These metrics were chosen to provide a comprehensive evaluation of the segmentation performance, covering aspects such as volumetric accuracy and spatial consistency.
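As a rough illustration of how these four metrics can be computed on binary masks, the following is a minimal NumPy sketch, not the challenge's official evaluation code. The function names, the 6-neighbour definition of a surface voxel, and the choice to report RAVD as a percentage are assumptions made here for illustration. The surface distances are computed by brute force, which is only practical for small volumes; real evaluations typically rely on distance transforms instead.

```python
import numpy as np


def dice(seg, ref):
    """DICE overlap between two binary masks (1.0 means identical)."""
    seg, ref = seg.astype(bool), ref.astype(bool)
    denom = seg.sum() + ref.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumption)
    return 2.0 * np.logical_and(seg, ref).sum() / denom


def ravd(seg, ref):
    """Relative absolute volume difference, in percent of the reference volume."""
    seg, ref = seg.astype(bool), ref.astype(bool)
    return 100.0 * abs(int(seg.sum()) - int(ref.sum())) / ref.sum()


def surface_points(mask, spacing):
    """Physical coordinates of foreground voxels that touch the background.

    A voxel counts as surface if at least one of its face neighbours
    (6-connectivity in 3D) is background or lies outside the volume.
    """
    mask = mask.astype(bool)
    padded = np.pad(mask, 1, constant_values=False)
    inner = (slice(1, -1),) * mask.ndim
    interior = np.ones_like(mask)
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            interior &= np.roll(padded, shift, axis=axis)[inner]
    border = mask & ~interior
    return np.argwhere(border) * np.asarray(spacing, dtype=float)


def surface_distances(seg, ref, spacing):
    """Nearest-surface distances in both directions (brute force)."""
    a = surface_points(seg, spacing)
    b = surface_points(ref, spacing)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1), d.min(axis=0)


def assd(seg, ref, spacing):
    """Average symmetric surface distance, in the units of `spacing`."""
    d_ab, d_ba = surface_distances(seg, ref, spacing)
    return (d_ab.sum() + d_ba.sum()) / (len(d_ab) + len(d_ba))


def mssd(seg, ref, spacing):
    """Maximum symmetric surface distance, in the units of `spacing`."""
    d_ab, d_ba = surface_distances(seg, ref, spacing)
    return max(d_ab.max(), d_ba.max())
```

Each function takes a predicted mask and a reference mask of the same shape; the surface metrics additionally take the voxel spacing (for example in mm) so that distances are reported in physical units. The split mirrors the distinction drawn above: DICE and RAVD depend only on voxel counts, while ASSD and MSSD depend on where the boundaries lie, which is why a model can score well on the former and poorly on the latter.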

Results

The participating teams largely employed U-Net variants and other convolutional neural network-based approaches, reflecting the dominance of these architectures in the domain. Ensembles of models demonstrated superior performance, particularly in single-modality tasks like CT liver segmentation (Task 2), where DICE scores approached inter-expert variability levels. However, tasks involving cross-modality (Task 1) and multi-modal segmentation (Task 4) presented substantial challenges, revealing the current limitations of DL models when trained on mixed data sources.
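The summary does not say how the participating teams combined their models, so the following is only a generic sketch of two common ensembling schemes for segmentation: averaging per-voxel foreground probabilities and majority voting over binary masks. The function names and the 0.5 threshold are assumptions chosen for illustration.

```python
import numpy as np


def ensemble_mean(prob_maps, threshold=0.5):
    """Average per-voxel foreground probabilities, then threshold.

    `prob_maps` is a sequence of arrays with identical shape, one per model.
    """
    stacked = np.stack(prob_maps, axis=0)
    return stacked.mean(axis=0) >= threshold


def ensemble_majority(masks):
    """Keep a voxel if strictly more than half of the models mark it foreground."""
    stacked = np.stack([np.asarray(m).astype(bool) for m in masks], axis=0)
    return stacked.sum(axis=0) * 2 > len(masks)
```

Probability averaging preserves each model's confidence, whereas majority voting only needs the final masks and is therefore usable when models expose no probabilities.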

For the MR-only tasks, which combine data from different MRI sequences, the DL models performed reasonably well, although models trained to segment all organs still performed worse than organ-specific ones (a drop of around 5%). Importantly, current models were robust on volumetric measures but consistently underperformed on distance-based metrics such as MSSD, which are crucial for surgical applications.

Implications and Future Directions

The CHAOS challenge highlights several critical insights into the application of DL models for medical image segmentation:

  1. Model Generalization: Despite advances, cross-modality and multi-organ tasks revealed gaps in the generalization capabilities of DL models. Future research should explore domain adaptation strategies and more sophisticated architectures to bridge these gaps.
  2. Robustness and Scalability: Ensemble approaches demonstrated robustness, but issues such as scalability and computational cost need attention, especially when considering clinical deployment.
  3. Clinical Applicability: While DL models showed promise, the integration with clinical workflows requires further refinements. Future work should aim at developing deployable solutions that can operate under real-world conditions, accounting for variability in imaging protocols.
  4. Addressing Peeking: The challenge organizers brought to light issues like multiple submissions and peeking. Strategies to ensure fair evaluation, including potential restrictions or requirements for open-source submissions, are necessary to uphold the scientific integrity of challenge results.

The CHAOS dataset remains a valuable resource for the community, encouraging continued experimentation and development of innovative methods with the potential to improve segmentation accuracy and utility in clinical practice. As deep learning evolves, remaining challenges, especially those concerning complex, cross-modality tasks, will likely see new solutions grounded in the integration of DL models with more traditional image processing approaches.

Authors (27)
  1. A. Emre Kavur (8 papers)
  2. N. Sinem Gezer (2 papers)
  3. Mustafa Barış (1 paper)
  4. Sinem Aslan (14 papers)
  5. Pierre-Henri Conze (38 papers)
  6. Vladimir Groza (3 papers)
  7. Duc Duy Pham (2 papers)
  8. Soumick Chatterjee (29 papers)
  9. Philipp Ernst (6 papers)
  10. Savaş Özkan (13 papers)
  11. Bora Baydar (3 papers)
  12. Dmitry Lachinov (4 papers)
  13. Josef Pauli (5 papers)
  14. Fabian Isensee (74 papers)
  15. Matthias Perkonigg (7 papers)
  16. Rachana Sathish (12 papers)
  17. Ronnie Rajan (7 papers)
  18. Debdoot Sheet (32 papers)
  19. Gurbandurdy Dovletov (2 papers)
  20. Oliver Speck (26 papers)
Citations (531)