Image to Images Translation for Multi-Task Organ Segmentation and Bone Suppression in Chest X-Ray Radiography (1906.10089v2)

Published 24 Jun 2019 in eess.IV and cs.CV

Abstract: Chest X-ray radiography is one of the earliest medical imaging technologies and remains one of the most widely-used for diagnosis, screening, and treatment follow up of diseases related to lungs and heart. The literature in this field of research reports many interesting studies dealing with the challenging tasks of bone suppression and organ segmentation but performed separately, limiting any learning that comes with the consolidation of parameters that could optimize both processes. This study, and for the first time, introduces a multitask deep learning model that generates simultaneously the bone-suppressed image and the organ-segmented image, enhancing the accuracy of tasks, minimizing the number of parameters needed by the model and optimizing the processing time, all by exploiting the interplay between the network parameters to benefit the performance of both tasks. The architectural design of this model, which relies on a conditional generative adversarial network, reveals the process on how the well-established pix2pix network (image-to-image network) is modified to fit the need for multitasking and extending it to the new image-to-images architecture. The developed source code of this multitask model is shared publicly on Github as the first attempt for providing the two-task pix2pix extension, a supervised/paired/aligned/registered image-to-images translation which would be useful in many multitask applications. Dilated convolutions are also used to improve the results through a more effective receptive field assessment. The comparison with state-of-the-art algorithms along with ablation study and a demonstration video are provided to evaluate efficacy and gauge the merits of the proposed approach.

Summary

Image-to-Images Translation for Multi-Task Organ Segmentation and Bone Suppression in Chest X-Ray Radiography

The paper introduces a multitask deep learning model for chest X-ray analysis. Bone suppression and organ segmentation have previously been handled by separate models, so neither task could benefit from parameters learned for the other. The authors describe their work as the first to perform both tasks simultaneously, using a modified pix2pix network, with the stated aims of improving accuracy on each task while reducing the parameter count and processing time.

Technical Framework

The proposed model is a conditional generative adversarial network (cGAN). It extends pix2pix, a well-established image-to-image translation network, to what the authors term "image-to-images" translation: from a single input chest X-ray (CXR), the generator produces two output images at once, one with the bones suppressed and one showing the segmented organs. Dilated convolutions are used in the generator to enlarge the effective receptive field, which the authors report improves results on both tasks.
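To make the receptive-field point concrete, the sketch below computes the receptive field of a stack of convolution layers using the standard formula, in which a kernel of size k with dilation d spans d(k-1)+1 input positions. The layer configurations are hypothetical and chosen only for illustration; they are not the layers used in the paper.

```python
from typing import List, Tuple

# Each layer is described as (kernel_size, stride, dilation).
Layer = Tuple[int, int, int]


def effective_kernel(kernel: int, dilation: int) -> int:
    """Number of input positions spanned by a dilated kernel: d*(k-1)+1."""
    return dilation * (kernel - 1) + 1


def receptive_field(layers: List[Layer]) -> int:
    """Receptive field (along one axis) of a stack of conv layers.

    `jump` tracks how many input pixels separate two neighbouring
    positions in the current feature map; it grows with each stride.
    """
    rf = 1
    jump = 1
    for kernel, stride, dilation in layers:
        rf += (effective_kernel(kernel, dilation) - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks: three 3x3 layers, without and with dilation.
plain_stack = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
dilated_stack = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]

if __name__ == "__main__":
    print(receptive_field(plain_stack))    # 7
    print(receptive_field(dilated_stack))  # 15
```

Both stacks have the same number of weights, since dilation only spaces out the kernel taps. The dilated stack nevertheless sees roughly twice as much context per output pixel, which is the general motivation for using dilation in dense prediction tasks.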

Empirical Evaluation and Results

The researchers applied their multitask model to the JSRT dataset, for which ground-truth organ segmentation masks and bone-suppressed images are available. They evaluated it with cross-validation using the Dice coefficient, Jaccard index, and false-negative rate for segmentation and the mean structural similarity index (MSSIM) for bone suppression, and report better results than standalone models such as U-net and AutoEncoder. For example, the multitask framework reached an average Dice score of 0.985 for organ segmentation, which the summary reports as higher than other state-of-the-art methods, and an MSSIM of 0.976 for bone suppression. Besides improving task-specific accuracy, the model requires fewer parameters than running a separate model for each task.
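For readers unfamiliar with the segmentation metrics named above, here is a minimal sketch of Dice, Jaccard, and false-negative rate computed from binary masks using their standard definitions. The tiny masks and the convention for empty masks are assumptions made for illustration; this is not the authors' evaluation code, and MSSIM is omitted.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2D binary mask, 1 = organ, 0 = background


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def dice(pred: Mask, truth: Mask) -> float:
    """Dice = 2*TP / (2*TP + FP + FN); defined as 1.0 if both masks are empty."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


def jaccard(pred: Mask, truth: Mask) -> float:
    """Jaccard = TP / (TP + FP + FN); defined as 1.0 if both masks are empty."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = tp + fp + fn
    return 1.0 if denom == 0 else tp / denom


def false_negative_rate(pred: Mask, truth: Mask) -> float:
    """FNR = FN / (TP + FN); defined as 0.0 if truth has no positives."""
    tp, _, fn = confusion_counts(pred, truth)
    denom = tp + fn
    return 0.0 if denom == 0 else fn / denom


# Hypothetical 3x4 masks: 3 pixels agree, 1 is missed, 1 is spurious.
truth_mask = [[0, 1, 1, 0],
              [0, 1, 1, 0],
              [0, 0, 0, 0]]
pred_mask = [[0, 1, 1, 0],
             [0, 1, 0, 0],
             [0, 0, 1, 0]]

if __name__ == "__main__":
    print(dice(pred_mask, truth_mask))                 # 0.75
    print(jaccard(pred_mask, truth_mask))              # 0.6
    print(false_negative_rate(pred_mask, truth_mask))  # 0.25
```

Dice and Jaccard are monotonically related by J = D / (2 - D), so they always rank methods the same way. The false-negative rate adds separate information about how much of the organ was missed.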

Implications and Future Work

The findings from this research have significant implications for the field of radiology and beyond:

Clinical Efficiency: Automating organ segmentation and bone suppression, and doing both more accurately, could give healthcare providers faster and more reliable input for diagnosis. This may ease the workload on radiologists and potentially reduce diagnostic errors.
Computational Optimization: A single multitask model needs less computation and memory than two separate models. This makes deployment in practical clinical settings more feasible, including on devices with limited computational power.
Generalization Across Modalities: Preliminary results from other applications, such as low-dose CT and neuroimaging, suggest that the image-to-images translation approach is versatile. Future research could adapt the model to other high-resolution imaging tasks and potentially integrate it into multi-modal diagnostic frameworks.

Overall, the paper contributes a multitask methodology for medical image processing and encourages further investigation of multitask frameworks in AI-assisted diagnostics. The source code shared on GitHub should help the research community replicate and build upon these findings. Possible directions for future work include adapting the model to other imaging modalities, processing higher-resolution images, and adding further anatomical outputs.