Overview of "Synthetic Medical Images from Dual Generative Adversarial Networks"
The paper proposes a pipeline for generating synthetic medical images with two Generative Adversarial Networks (GANs) used in sequence. The method addresses key challenges in medical imaging research, including data scarcity, privacy constraints, and the limitations of public datasets. Using retinal fundus images as a test case, the authors split the complex image generation task into two simpler sub-problems, geometry and photorealism, each handled by one stage of a two-stage GAN framework.
The pipeline comprises two GANs operating sequentially. The Stage-I GAN generates variable segmentation masks that capture geometric diversity, and the Stage-II GAN translates these masks into photorealistic images; a minimal sketch of the composition follows. This hierarchical decomposition of the task improves training stability and the quality of the synthetic images compared to a single GAN.
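To make the composition concrete, here is a minimal PyTorch sketch of how the two stages chain at inference time. The module names (MaskGenerator, PhotorealismTranslator), layer counts, and the 64x64 image size are illustrative assumptions, not the paper's exact architecture.

```python
# Hypothetical sketch of the two-stage pipeline at inference time.
# Architectures are simplified placeholders, not the paper's exact networks.
import torch
import torch.nn as nn

class MaskGenerator(nn.Module):
    """Stage-I: maps a latent noise vector to a vessel segmentation mask."""
    def __init__(self, z_dim: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            # Project 1x1 noise to a 4x4 feature map, then upsample to 64x64.
            nn.ConvTranspose2d(z_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Tanh(),  # 1-channel mask
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)

class PhotorealismTranslator(nn.Module):
    """Stage-II: conditional generator that maps a mask to a fundus-like image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 3, 1, 1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, 1, 1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, 1, 1), nn.Tanh(),  # 3-channel RGB output
        )

    def forward(self, mask: torch.Tensor) -> torch.Tensor:
        return self.net(mask)

stage1, stage2 = MaskGenerator(), PhotorealismTranslator()
z = torch.randn(8, 100, 1, 1)      # batch of latent vectors
masks = stage1(z)                  # geometry: synthetic vessel masks
images = stage2(masks)             # photorealism: synthetic fundus images
print(masks.shape, images.shape)   # [8, 1, 64, 64] and [8, 3, 64, 64]
```

The key design point is that randomness enters only in Stage-I: diversity comes from sampling new masks, while Stage-II is a deterministic mask-to-image translation.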
Data and Methodology
The Stage-I GAN is trained on retinal vessel segmentation masks from the DRIVE dataset, while the MESSIDOR dataset is used to train the Stage-II GAN for photorealistic image synthesis. The authors use a deep convolutional architecture (DCGAN) for the first stage, which replaces pooling layers with strided convolutions to preserve fine features that matter in medical images (see the sketch after this paragraph). The second stage is a conditional GAN (cGAN) that learns a mapping from segmentation masks to realistic fundus images.
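As an illustration of the pooling-free DCGAN convention, here is a minimal discriminator sketch in PyTorch in which strided convolutions perform all downsampling; the channel widths and the 64x64 input size are assumptions for this example, not the paper's reported configuration.

```python
# Sketch of a DCGAN-style discriminator for 64x64 masks: strided convolutions
# replace pooling, so downsampling is learned rather than fixed.
import torch
import torch.nn as nn

discriminator = nn.Sequential(
    nn.Conv2d(1, 64, 4, stride=2, padding=1),    # 64x64 -> 32x32 (no pooling)
    nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, 4, stride=2, padding=1),  # 32x32 -> 16x16
    nn.BatchNorm2d(128), nn.LeakyReLU(0.2),
    nn.Conv2d(128, 256, 4, stride=2, padding=1), # 16x16 -> 8x8
    nn.BatchNorm2d(256), nn.LeakyReLU(0.2),
    nn.Conv2d(256, 1, 8),                        # 8x8 -> 1x1 real/fake score
    nn.Sigmoid(),
)

score = discriminator(torch.randn(8, 1, 64, 64))  # shape: [8, 1, 1, 1]
```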
Experimental Validation
For validation, the authors train U-Net segmentation networks on synthetic and on real datasets. The network trained on synthetic images achieves an F1-score of 0.8877, compared to 0.8988 for the network trained on real images, suggesting that the synthetic data is nearly as effective for training.
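For reference, the pixel-wise F1-score behind this comparison can be computed as in the sketch below; the function and the toy masks are illustrative, not the authors' evaluation code.

```python
# Illustrative pixel-wise F1 computation for binary segmentation masks.
import numpy as np

def f1_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """F1 = 2 * precision * recall / (precision + recall), over pixels."""
    tp = np.logical_and(pred == 1, truth == 1).sum()
    fp = np.logical_and(pred == 1, truth == 0).sum()
    fn = np.logical_and(pred == 0, truth == 1).sum()
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Toy usage with random masks; a real evaluation would use held-out test images.
rng = np.random.default_rng(0)
pred = rng.integers(0, 2, size=(64, 64))
truth = rng.integers(0, 2, size=(64, 64))
print(f"F1 = {f1_score(pred, truth):.4f}")
```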
The paper also compares the datasets' statistical distributions using Kullback-Leibler (KL) divergence, finding that the synthetic data preserves the statistical properties of the real data without directly duplicating it. This is crucial: it shows that the generative system produces genuinely novel samples, expanding dataset volume and diversity without compromising patient privacy.
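The summary does not specify which distributions the KL divergence is computed over; one simple, common stand-in is the pixel-intensity histogram of each image set, as in the following sketch. The arrays and bin choices here are assumptions for illustration and may differ from the authors' setup.

```python
# Illustrative sketch: KL divergence between pixel-intensity histograms of a
# real and a synthetic image set (the authors' exact setup may differ).
import numpy as np

def kl_divergence(p: np.ndarray, q: np.ndarray, eps: float = 1e-10) -> float:
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i), smoothed to avoid log(0)."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def intensity_histogram(images: np.ndarray, bins: int = 256) -> np.ndarray:
    counts, _ = np.histogram(images, bins=bins, range=(0, 256))
    return counts.astype(float)

# Stand-in arrays; real usage would load the actual image sets.
rng = np.random.default_rng(1)
real = rng.integers(0, 256, size=(100, 64, 64))
synthetic = rng.integers(0, 256, size=(100, 64, 64))
d = kl_divergence(intensity_histogram(real), intensity_histogram(synthetic))
print(f"KL(real || synthetic) = {d:.4f}")  # small value => similar distributions
```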
Discussion and Implications
Simplifying a complex image generation task into a more stable two-stage process stands out as an effective methodological innovation. While the evaluation of image fidelity is largely qualitative, the broader implication of this work lies in its capacity to democratize medical data: by replacing sensitive patient information with synthetic equivalents, research across the scientific community can proceed without proprietary data restrictions.
The paper posits that the resulting synthetic datasets could spur advances akin to those ImageNet enabled in computer vision. Combined with public datasets, this could substantially accelerate progress in automated medical diagnosis systems, improving efficiency and accessibility in medical imaging tasks.
Conclusion and Future Directions
The paper concludes by emphasizing that the dual-GAN pipeline scales to data types beyond medical imaging. The authors discuss potential optimizations, including alternative representations for segmentation masks and more advanced network architectures, to improve computational efficiency and broaden the scope of application. They also highlight the importance of augmenting the training data with diverse real-world examples to continuously refine the generative models.
Overall, this work provides a promising foundation for using synthetic data to overcome significant barriers in medical imaging research, and it could catalyze similar developments in other complex data domains. Future research could integrate these methods with larger datasets and explore their applicability to dynamic, multifaceted imaging scenarios, given the complexity of medical images and their diagnostic stakes.