- The paper demonstrates that geometric domain shifts drastically reduce segmentation performance, with DSC drops of 46% for RGB and 45% for hyperspectral (HSI) data.
- The study introduces the Organ Transplantation augmentation method, which transplants organ-like regions to create diverse training scenarios for enhanced network generalization.
- Extensive experiments on 600 image cubes from 33 pigs, annotated with 19 tissue classes, show DSC improvements of up to 67% for RGB and 90% for HSI, validating the approach.
Semantic Segmentation of Surgical Hyperspectral Images Under Geometric Domain Shifts
The paper addresses a significant challenge in automatic semantic segmentation of intraoperative images: the impact of geometric domain shifts. These shifts are common in open surgery because procedures vary and surgical instruments occlude the scene, yet they have been inadequately addressed in the existing literature. The paper examines how geometric domain shifts affect the performance of state-of-the-art (SOA) semantic segmentation networks and introduces a data augmentation method called "Organ Transplantation" to mitigate the resulting degradation.
Core Contributions
- Analysis of Performance Degradation Under Geometric Domain Shifts: The authors show that geometric domain shifts can severely impair the performance of SOA semantic segmentation networks. In a comprehensive evaluation on six out-of-distribution (OOD) data sets consisting of RGB and hyperspectral imaging (HSI) data, the Dice Similarity Coefficient (DSC) drops by 46% for RGB and 45% for HSI data in geometric OOD scenarios. This analysis reveals a considerable gap in the generalization capabilities of existing networks when applied to real-world surgical conditions.
- Introduction of the Organ Transplantation Augmentation Technique: To improve generalizability, the paper introduces an augmentation strategy adapted from the broader computer vision domain. By transplanting organ-like regions between images, the technique creates diverse training scenarios that better prepare networks for geometric variations. Applying this augmentation improved the DSC by up to 67% for RGB data and 90% for HSI data, bringing performance on OOD test data on par with in-distribution performance.
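The transplantation idea described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `transplant_organ` and `organ_transplantation_batch`, the probability parameter `p`, and the choice to copy every pixel of one randomly chosen class from a randomly chosen donor in the same batch are all hypothetical choices made for this example.

```python
import numpy as np


def transplant_organ(donor_img, donor_seg, recipient_img, recipient_seg, label):
    """Copy every pixel of class `label` from the donor into the recipient.

    Images have shape (H, W, C) and segmentation maps have shape (H, W).
    Returns new arrays; the inputs are left untouched.
    """
    if donor_img.shape != recipient_img.shape or donor_seg.shape != recipient_seg.shape:
        raise ValueError("donor and recipient must have matching shapes")
    mask = donor_seg == label            # (H, W) boolean mask of the organ
    out_img = recipient_img.copy()
    out_seg = recipient_seg.copy()
    out_img[mask] = donor_img[mask]      # copies all channels of the masked pixels
    out_seg[mask] = label                # keep the annotation consistent
    return out_img, out_seg


def organ_transplantation_batch(imgs, segs, rng, p=0.5):
    """Apply the transplant to each image in a batch with probability `p`.

    `imgs` has shape (N, H, W, C) and `segs` has shape (N, H, W).
    Donors are always taken from the original (unaugmented) batch.
    """
    out_imgs, out_segs = imgs.copy(), segs.copy()
    n = imgs.shape[0]
    for i in range(n):
        if n < 2 or rng.random() >= p:
            continue
        # Hypothetical sampling scheme: a random other image, a random class in it.
        donor = int(rng.choice([j for j in range(n) if j != i]))
        label = rng.choice(np.unique(segs[donor]))
        out_imgs[i], out_segs[i] = transplant_organ(
            imgs[donor], segs[donor], imgs[i], segs[i], label
        )
    return out_imgs, out_segs
```

Because the mask only indexes the two spatial axes, the same code works for 3-channel RGB images and for hyperspectral cubes with many channels, which matches the network-independent character of the augmentation described in the paper.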
Experimental Design and Results
A sizable dataset was compiled, comprising 600 hyperspectral image cubes from 33 pigs, annotated with 19 tissue classes. This dataset enabled an in-depth exploration of segmentation performance across varied OOD settings, including organs in isolation, organ resections, and situs occlusions.
The approach used a U-Net architecture with an efficientnet-b5 encoder pre-trained on ImageNet and optimized with stochastic weight averaging. Training combined traditional geometric transformations with the new augmentation strategy.
The results show that the Organ Transplantation technique consistently outperforms other augmentation methods in addressing geometric domain shifts. HSI data yielded better segmentation performance than RGB data throughout, which suggests that its rich spectral content offers some resilience against the loss of context caused by geometric shifts.
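Since all results above are reported as DSC, a short sketch of the metric may help. It computes the per-class score 2|P ∩ T| / (|P| + |T|) between a predicted and a reference label map. The helper names `dice_per_class` and `mean_dice` are hypothetical, and skipping classes that are absent from both maps is an assumption of this example; the paper's exact aggregation over images and subjects is not reproduced here.

```python
import numpy as np


def dice_per_class(pred, target, num_classes):
    """Return {class: DSC} for every class present in `pred` or `target`.

    `pred` and `target` are integer label maps of identical shape.
    """
    scores = {}
    for c in range(num_classes):
        p = pred == c
        t = target == c
        denom = p.sum() + t.sum()
        if denom == 0:
            continue  # class absent from both maps: no score defined (assumption)
        scores[c] = float(2.0 * np.logical_and(p, t).sum() / denom)
    return scores


def mean_dice(pred, target, num_classes):
    """Average the per-class DSC over the classes that received a score."""
    scores = dice_per_class(pred, target, num_classes)
    return float(np.mean(list(scores.values())))


# Tiny usage example with two classes present out of three possible.
pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
per_class = dice_per_class(pred, target, num_classes=3)
```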
Implications and Future Directions
This research underscores the necessity of addressing geometric domain shifts for enhancing the reliability of surgical AI systems in real-world conditions. The proposed augmentation method offers a network-independent solution that is both simple and effective, highlighting its potential for broader application beyond the immediate surgical domain.
Future research could explore the combination of this augmentation with other robust training paradigms, such as domain adaptation and meta-learning, to further bolster the generalizability of surgical AI. Moreover, as HSI technology matures, its application in minimally invasive surgery could prove fruitful, extending the benefits observed in open surgery to laparoscopic procedures.
In conclusion, the paper provides a critical step forward in the pursuit of robust, reliable semantic segmentation within the highly variable domain of surgical imaging, offering a valuable tool to enhance the performance and utility of computer vision systems in medical applications.