
Semantic segmentation of surgical hyperspectral images under geometric domain shifts (2303.10972v2)

Published 20 Mar 2023 in eess.IV, cs.CV, and cs.LG

Abstract: Robust semantic segmentation of intraoperative image data could pave the way for automatic surgical scene understanding and autonomous robotic surgery. Geometric domain shifts, however, although common in real-world open surgeries due to variations in surgical procedures or situs occlusions, remain a topic largely unaddressed in the field. To address this gap in the literature, we (1) present the first analysis of state-of-the-art (SOA) semantic segmentation networks in the presence of geometric out-of-distribution (OOD) data, and (2) address generalizability with a dedicated augmentation technique termed "Organ Transplantation" that we adapted from the general computer vision community. According to a comprehensive validation on six different OOD data sets comprising 600 RGB and hyperspectral imaging (HSI) cubes from 33 pigs semantically annotated with 19 classes, we demonstrate a large performance drop of SOA organ segmentation networks applied to geometric OOD data. Surprisingly, this holds true not only for conventional RGB data (drop of Dice similarity coefficient (DSC) by 46 %) but also for HSI data (drop by 45 %), despite the latter's rich information content per pixel. Using our augmentation scheme improves on the SOA DSC by up to 67 % (RGB) and 90 % (HSI) and renders performance on par with in-distribution performance on real OOD test data. The simplicity and effectiveness of our augmentation scheme makes it a valuable network-independent tool for addressing geometric domain shifts in semantic scene segmentation of intraoperative data. Our code and pre-trained models are available at https://github.com/IMSY-DKFZ/htc.

Citations (5)

Summary

  • The paper demonstrates that geometric domain shifts drastically reduce segmentation performance with marked DSC drops in both RGB and hyperspectral data.
  • The study introduces the Organ Transplantation augmentation method, which transplants organ-like regions to create diverse training scenarios for enhanced network generalization.
  • Extensive experiments on 600 RGB and HSI cubes from 33 pigs, annotated with 19 classes, show DSC improvements of up to 67% (RGB) and 90% (HSI) over the SOA baseline, validating the approach.

Semantic Segmentation of Surgical Hyperspectral Images Under Geometric Domain Shifts

The paper addresses a significant challenge in automatic semantic segmentation of intraoperative images: the impact of geometric domain shifts. These shifts are common in open surgery because of variations in surgical procedures and situs occlusions, yet they have received little attention in the literature. The paper examines how geometric domain shifts affect state-of-the-art (SOA) semantic segmentation networks and introduces a data augmentation method called "Organ Transplantation", adapted from the general computer vision community, to mitigate the problem.

Core Contributions

  1. Analysis of Performance Degradation Under Geometric Domain Shifts: The authors show that geometric domain shifts can severely impair SOA semantic segmentation networks. In an evaluation on six out-of-distribution (OOD) data sets of RGB and hyperspectral imaging (HSI) data, the Dice similarity coefficient (DSC) drops by 46% for RGB and by 45% for HSI on geometric OOD data. This reveals a considerable gap in the generalization capabilities of existing networks under real-world surgical conditions.
  2. Introduction of the Organ Transplantation Augmentation Technique: To improve generalizability, the paper introduces an augmentation strategy adapted from the broader computer vision domain. By transplanting organ-like regions between images, the technique creates diverse training scenarios that better prepare networks for geometric variations. With this augmentation, the DSC improves on the SOA by up to 67% for RGB data and up to 90% for HSI data, bringing performance on real OOD test data on par with in-distribution performance.
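The idea of transplanting a region between images can be illustrated with a minimal NumPy sketch. It rests on assumptions that this summary does not confirm: the function name `transplant_organ` is hypothetical, both images are assumed to have the same height and width, and the whole region of one class is copied without any blending or repositioning. The authors' actual implementation lives in their repository and may differ.

```python
import numpy as np

def transplant_organ(donor_img, donor_seg, recipient_img, recipient_seg, class_id):
    """Copy every pixel labelled `class_id` in the donor into the recipient.

    Images have shape (H, W, C); segmentations have shape (H, W).
    Both the pixel values (RGB or spectral channels) and the labels are
    transplanted, so the augmented pair stays consistent.
    Hypothetical helper for illustration only.
    """
    mask = donor_seg == class_id           # (H, W) boolean mask of the organ
    out_img = recipient_img.copy()         # leave the inputs untouched
    out_seg = recipient_seg.copy()
    out_img[mask] = donor_img[mask]        # overwrite all channels at masked pixels
    out_seg[mask] = class_id               # overwrite the labels accordingly
    return out_img, out_seg

# Toy data: 4x4 images with 3 channels (an HSI cube would simply have more).
rng = np.random.default_rng(0)
donor_img = rng.random((4, 4, 3))
donor_seg = np.zeros((4, 4), dtype=np.int64)
donor_seg[1:3, 1:3] = 5                    # a 2x2 "organ" with class id 5
recipient_img = rng.random((4, 4, 3))
recipient_seg = np.ones((4, 4), dtype=np.int64)

aug_img, aug_seg = transplant_organ(
    donor_img, donor_seg, recipient_img, recipient_seg, class_id=5
)
```

Because the operation only touches the training samples, it is independent of the network that consumes them, which matches the paper's description of the method as network-independent.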

Experimental Design and Results

  • Data Collection:

The dataset comprises 600 RGB and HSI cubes from 33 pigs, semantically annotated with 19 classes. It enabled an in-depth exploration of segmentation performance across varied OOD settings, including organs in isolation, organ resections, and situs occlusions.

  • Network Architecture:

The approach used a U-Net architecture with an EfficientNet-B5 encoder pre-trained on ImageNet, optimized with stochastic weight averaging. Training combined conventional geometric transformations with the new augmentation strategy.

  • Effects of Augmentation:

The results show that the Organ Transplantation technique consistently outperforms other augmentation methods in addressing geometric domain shifts. HSI data was not immune to the problem: despite its rich per-pixel spectral information, the baseline DSC dropped by 45% on geometric OOD data, almost as much as for RGB (46%). With Organ Transplantation, HSI showed the larger gain over the SOA (up to 90%, versus up to 67% for RGB).
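For readers less familiar with the metric, the following sketch shows how a per-class DSC and a relative change between two DSC values can be computed. It is an illustration under stated assumptions, not the paper's evaluation code: the helper names `dice` and `relative_change` are hypothetical, the paper's aggregation across classes, images, and subjects is not described in this summary, and reading the reported percentages as relative changes is an assumption. The numbers in the example are made up.

```python
import numpy as np

def dice(pred, target, class_id):
    """Dice similarity coefficient for one class: 2|P ∩ T| / (|P| + |T|)."""
    p = pred == class_id
    t = target == class_id
    denom = p.sum() + t.sum()
    if denom == 0:
        # Class absent from both prediction and reference: DSC is undefined.
        return float("nan")
    return float(2.0 * np.logical_and(p, t).sum() / denom)

def relative_change(baseline, new):
    """Change of `new` relative to `baseline`, e.g. 0.6 means +60%."""
    return (new - baseline) / baseline

# Toy label maps (illustrative values only).
target = np.array([[1, 1],
                   [0, 0]])
pred = np.array([[1, 0],
                 [0, 0]])
dsc_class1 = dice(pred, target, class_id=1)   # 2 * 1 / (1 + 2)

# Hypothetical baseline and augmented DSC values.
gain = relative_change(baseline=0.50, new=0.80)
```

Under this reading, an improvement "by up to 90%" describes the gain relative to the baseline DSC, not an absolute DSC of 0.90.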

Implications and Future Directions

This research underscores the need to address geometric domain shifts if surgical AI systems are to be reliable in real-world conditions. The proposed augmentation method is simple, effective, and network-independent, which suggests it could be applied to other semantic scene segmentation tasks on intraoperative data.

Future research could explore the combination of this augmentation with other robust training paradigms, such as domain adaptation and meta-learning, to further bolster the generalizability of surgical AI. Moreover, as HSI technology matures, its application in minimally invasive surgery could prove fruitful, extending the benefits observed in open surgery to laparoscopic procedures.

In conclusion, the paper provides a critical step forward in the pursuit of robust, reliable semantic segmentation within the highly variable domain of surgical imaging, offering a valuable tool to enhance the performance and utility of computer vision systems in medical applications.
