
Low Dose CT Image Denoising Using a Generative Adversarial Network with Wasserstein Distance and Perceptual Loss (1708.00961v2)

Published 3 Aug 2017 in cs.CV

Abstract: In this paper, we introduce a new CT image denoising method based on the generative adversarial network (GAN) with Wasserstein distance and perceptual similarity. The Wasserstein distance is a key concept of optimal transport theory, and promises to improve the performance of the GAN. The perceptual loss compares the perceptual features of a denoised output against those of the ground truth in an established feature space, while the GAN helps migrate the data noise distribution from strong to weak. Therefore, our proposed method transfers our knowledge of visual perception to the image denoising task, and is capable of not only reducing the image noise level but also keeping the critical information at the same time. Promising results have been obtained in our experiments with clinical CT images.

Citations (1,111)

Summary

  • The paper presents a novel GAN-based denoising technique that integrates Wasserstein distance and perceptual loss to enhance CT image fidelity.
  • Experimental results on NIH-AAPM-Mayo Clinic data show that the WGAN-VGG model outperforms traditional methods by preserving structural details and reducing artifacts.
  • The approach offers a framework for balancing noise reduction against diagnostic quality in low-dose CT imaging.

Low Dose CT Image Denoising Using a Generative Adversarial Network with Wasserstein Distance and Perceptual Loss

The widespread use of computed tomography (CT) in medical diagnostics has raised concerns about patient radiation exposure. Reducing the radiation dose mitigates the risk, but it increases noise and artifacts in the reconstructed images, which makes accurate diagnosis harder. To address this, the paper presents a method for denoising low-dose CT (LDCT) images using a Generative Adversarial Network (GAN) trained with the Wasserstein distance and a perceptual loss.

Introduction and Background

CT imaging continues to be a cornerstone in clinical diagnostics, yet the associated radiation risks necessitate strategies to minimize patient exposure. Conventional methods for LDCT image denoising include sinogram filtration, iterative reconstruction (IR), and image post-processing. Sinogram filtration operates in the projection domain before image reconstruction but is often limited by the availability of sinogram data and the potential for resolution loss. IR methods optimize objective functions integrating system models, noise statistics, and image priors like total variation and dictionary learning, offering improved image quality at the cost of high computational demands. Image post-processing methods, although computationally efficient, generally struggle with over-smoothing and residual artifacts.

Emerging deep learning techniques, particularly Convolutional Neural Networks (CNNs), have shown significant promise for LDCT denoising by extending methods like image super-resolution to CT imaging. However, traditional loss functions such as Mean Squared Error (MSE) employed in these networks can lead to over-smoothed images and loss of critical structural details.

Methodology

The paper proposes a GAN-based denoiser that combines the Wasserstein distance with a perceptual loss (WGAN-VGG). The generator (G) maps LDCT images to estimates of their normal-dose CT (NDCT) counterparts, while the discriminator (D) scores real NDCT images against generated ones. Training drives down the Wasserstein distance between the distribution of generated images and the NDCT distribution. The Wasserstein metric, also called the Earth-Mover (EM) distance, replaces the Jensen-Shannon (JS) divergence of the original GAN objective, which improves training stability and convergence.
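The stability argument can be illustrated with a toy case that is not taken from the paper: two point masses on a 1-D grid. This is a minimal sketch using only the standard library, and the helper names (`js_divergence`, `em_distance`, `point_mass`) are hypothetical. When the two distributions do not overlap, the JS divergence stays at log 2 no matter how far apart they are, while the EM distance grows with the separation and so still tells the generator which direction to move.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (natural log)."""
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def em_distance(p, q):
    """Earth-Mover distance between two histograms on the same unit-spaced 1-D grid.

    In one dimension it equals the summed absolute gap between the two
    cumulative distributions.
    """
    total, cdf_gap = 0.0, 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        total += abs(cdf_gap)
    return total

def point_mass(position, size=10):
    """Histogram with all probability in a single bin."""
    return [1.0 if i == position else 0.0 for i in range(size)]

target = point_mass(0)
for shift in (1, 4, 8):
    moved = point_mass(shift)
    # JS stays at log(2) ~= 0.6931 for every shift; EM equals the shift.
    print(shift, round(js_divergence(target, moved), 4), em_distance(target, moved))
```

In a real WGAN the distance is not computed in closed form like this; the discriminator (critic) is trained to estimate it from samples.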

Additionally, a perceptual loss, computed with a pre-trained VGG-19 network, is integrated into the framework. This loss uses feature representations learned from a large dataset of natural images, which align more closely with human visual perception than pixel-wise MSE. Because it compares features rather than individual pixels, the perceptual loss helps preserve the image details and structures that matter for diagnosis.
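The difference between pixel-space and feature-space comparison can be sketched with a toy example. The code below does not use VGG-19: `toy_features` is a hypothetical stand-in (horizontal differences, which respond to edges) chosen only so that the example runs without any deep learning library. The point is the structure of the loss, an MSE taken after a fixed feature extractor.

```python
def mse(a, b):
    """Mean squared error between two equally sized 2-D lists."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def toy_features(image):
    """Hypothetical stand-in for a pre-trained feature extractor.

    Returns horizontal differences, which respond to edges rather than
    to absolute intensity. The paper uses VGG-19 features instead.
    """
    return [[row[j + 1] - row[j] for j in range(len(row) - 1)] for row in image]

def perceptual_loss(output, reference, feature_fn=toy_features):
    """MSE measured in feature space instead of pixel space."""
    return mse(feature_fn(output), feature_fn(reference))

reference = [[0, 0, 1, 1], [0, 0, 1, 1]]                    # sharp edge
blurred = [[0, 0.25, 0.75, 1], [0, 0.25, 0.75, 1]]          # edge smeared out
shifted = [[0.25, 0.25, 1.25, 1.25], [0.25, 0.25, 1.25, 1.25]]  # brighter, edge intact

# Pixel MSE prefers the blurred image; the feature-space loss prefers
# the image whose edge is preserved.
print(mse(blurred, reference), mse(shifted, reference))
print(perceptual_loss(blurred, reference), perceptual_loss(shifted, reference))
```

In this toy case the blurred image scores better on pixel MSE even though it has lost the edge, which mirrors the over-smoothing behavior attributed to MSE-trained networks.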

The combined loss function thus includes the WGAN loss to ensure distributional similarity and the VGG-based perceptual loss to retain image fidelity, expressed as:

$$\min_G \max_D \left[ \text{WGAN loss} + \lambda_1 \, \text{Perceptual loss} \right]$$
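One common way to turn an objective of this shape into two training losses is sketched below. It assumes the usual WGAN convention in which the critic D outputs an unbounded score, and it leaves out the regularization (for example weight clipping or a gradient penalty) that a WGAN needs to keep the critic well behaved. The function names and the value passed for `lambda_1` are illustrative, not taken from the paper.

```python
def mean(values):
    return sum(values) / len(values)

def critic_loss(real_scores, fake_scores):
    """Loss minimized by the critic D.

    Minimizing it widens the gap between scores on real NDCT images
    and scores on generated images.
    """
    return mean(fake_scores) - mean(real_scores)

def generator_loss(fake_scores, perceptual, lambda_1):
    """Loss minimized by the generator G.

    The adversarial term rewards high critic scores on generated images;
    lambda_1 weights the perceptual term against it.
    """
    return -mean(fake_scores) + lambda_1 * perceptual

# Hypothetical critic scores for one batch.
real_scores = [0.9, 1.1, 1.0]
fake_scores = [-0.5, -0.3, -0.4]

print(critic_loss(real_scores, fake_scores))
print(generator_loss(fake_scores, perceptual=0.2, lambda_1=0.1))
```

The two losses are typically minimized in alternation, with several critic updates per generator update, although the schedule is a tuning choice.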

Experimental Results

The experiments used clinical data from the NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge, which contains both normal-dose and simulated quarter-dose CT images. Training used more than 100,000 image patches and the Adam optimizer with tuned hyperparameters.
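Patch-based training of this kind can be sketched as follows. The summary does not say how patches were sampled or what size they were, so the regular grid, `patch_size`, and `stride` below are assumptions made for illustration. In practice the same coordinates would be applied to the quarter-dose image and its normal-dose counterpart so that each input patch has a matching target.

```python
def extract_patches(image, patch_size, stride):
    """Return all patch_size x patch_size sub-images on a regular grid.

    `image` is a 2-D list of pixel values. Patches that would run past
    the image border are skipped.
    """
    rows, cols = len(image), len(image[0])
    patches = []
    for top in range(0, rows - patch_size + 1, stride):
        for left in range(0, cols - patch_size + 1, stride):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches

# A 6x6 toy image whose pixel value encodes its position.
toy_image = [[r * 6 + c for c in range(6)] for r in range(6)]
patches = extract_patches(toy_image, patch_size=2, stride=2)
print(len(patches))   # 9 non-overlapping 2x2 patches
print(patches[0])     # [[0, 1], [6, 7]]
```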

The proposed method (WGAN-VGG) compared favorably with traditional and other deep learning-based denoising approaches in qualitative and quantitative assessments. MSE-based networks achieved higher PSNR values, but at the cost of over-smoothing, whereas WGAN-VGG preserved essential image details and reduced artifacts, as indicated by subjective evaluations from radiologists.
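PSNR is a direct function of pixel-wise MSE, which is why a network trained to minimize MSE tends to score well on it regardless of perceived sharpness. A minimal sketch of the standard definition, with `max_value` standing for the assumed dynamic range of the images:

```python
import math

def psnr(output, reference, max_value):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D lists."""
    diffs = [(x - y) ** 2
             for ro, rr in zip(output, reference)
             for x, y in zip(ro, rr)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)

reference = [[0, 0], [0, 0]]
noisy = [[1, 1], [1, 1]]
print(psnr(noisy, reference, max_value=10))  # 20.0
```

Because lower MSE always means higher PSNR, the metric cannot by itself distinguish a faithful reconstruction from an over-smoothed one, which is why the reader studies matter here.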

Discussion

The improved denoising performance of the WGAN-VGG network underscores the value of the perceptual loss for preserving diagnostic quality. By transferring knowledge from a VGG network trained on natural images, the proposed method reduces noise while maintaining critical anatomical details. Further, using the Wasserstein metric within the GAN framework encourages the distribution of denoised images to match that of NDCT images.

Despite the promising results, the approach requires re-tuning for datasets with different noise characteristics. Future work could use more complex generator architectures to improve performance further, and could explore networks that map raw projection data directly to images, potentially avoiding the information loss inherent in filtered backprojection (FBP) reconstruction.

Conclusion

This paper presents a technique for LDCT image denoising that uses a WGAN framework combined with a perceptual loss. The proposed method balances noise reduction against detail preservation, addressing a critical need in radiological diagnostics. Future research directions include extending the approach to more general CT image reconstruction and adapting it to varied noise conditions.