
Projected Distribution Loss for Image Enhancement (2012.09289v2)

Published 16 Dec 2020 in cs.CV and eess.IV

Abstract: Features obtained from object recognition CNNs have been widely used for measuring perceptual similarities between images. Such differentiable metrics can be used as perceptual learning losses to train image enhancement models. However, the choice of the distance function between input and target features may have a consequential impact on the performance of the trained model. While using the norm of the difference between extracted features leads to limited hallucination of details, measuring the distance between distributions of features may generate more textures; yet also more unrealistic details and artifacts. In this paper, we demonstrate that aggregating 1D-Wasserstein distances between CNN activations is more reliable than the existing approaches, and it can significantly improve the perceptual performance of enhancement models. More explicitly, we show that in imaging applications such as denoising, super-resolution, demosaicing, deblurring and JPEG artifact removal, the proposed learning loss outperforms the current state-of-the-art on reference-based perceptual losses. This means that the proposed learning loss can be plugged into different imaging frameworks and produce perceptually realistic results.


Summary

  • The paper introduces Projected Distribution Loss (PDL) for image enhancement, measuring feature distribution differences using the 1D-Wasserstein distance to improve perceptual quality.
  • PDL avoids common artifacts like unrealistic details often associated with traditional loss functions by comparing distributions of projected CNN features instead of direct feature norms.
  • Evaluated across various image restoration tasks, PDL consistently yields superior perceptual results verified by quantitative metrics and human studies, with minimal computational overhead.

Projected Distribution Loss for Image Enhancement

The paper "Projected Distribution Loss for Image Enhancement" by Mauricio Delbracio, Hossein Talebi, and Peyman Milanfar presents a novel approach to perceptual learning in image enhancement. The research focuses on improving the perceptual quality of images restored by deep learning models through a refined loss function that measures distances between distributions of features extracted by object recognition CNNs.

The core contribution of this work is the Projected Distribution Loss (PDL), which aggregates 1D-Wasserstein distances between projected CNN activations. The authors position PDL as a middle ground between two failure modes: norm-based feature losses, which hallucinate few details but tend toward blur, and distribution-matching or adversarial losses, which generate more texture but also more unrealistic details and artifacts.
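In the equal-sample-size case (a sketch in our own notation, not necessarily the paper's exact formulation), the 1D-Wasserstein-1 distance between two empirical samples has a closed form via order statistics:

```latex
W_1(\hat{p}_x, \hat{p}_y) \;=\; \frac{1}{n} \sum_{i=1}^{n} \bigl| x_{(i)} - y_{(i)} \bigr|
```

where x_(i) and y_(i) are the sorted values of the two projected activation samples; the loss then aggregates such terms over the chosen 1D projections of the feature tensors.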

Key Methodological Details

  1. Feature Space Comparison: Traditional perceptual losses often minimize L_p norms over pixel differences or feature activations, which tends to produce blurred or artifact-ridden outputs. PDL instead aggregates 1D-Wasserstein distances computed over projected CNN activations, balancing the preservation of real content against the suppression of hallucinated details.
  2. Multi-Task Proficiency: The paper applies PDL across image restoration tasks such as denoising, super-resolution, demosaicing, deblurring, and JPEG artifact removal. PDL consistently outperforms existing loss functions in producing perceptually realistic images, as measured by quantitative metrics such as PSNR and LPIPS alongside qualitative human assessments.
  3. Scalability and Implementation: PDL adds minimal computational overhead. The sorting operations needed to compute the 1D-Wasserstein distances cost only O(n log n), making PDL easy to integrate into existing deep learning pipelines without significantly affecting training times.
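As a rough sketch of the core computation (not the authors' implementation: the paper obtains 1D projections of pretrained-CNN features, and feature extraction is omitted here; treating each channel as one projection is our simplifying assumption), the sort-and-compare structure can be illustrated in NumPy:

```python
import numpy as np

def w1_1d(x, y):
    # 1D Wasserstein-1 between equal-size empirical samples:
    # sort both and average the absolute gaps between order statistics.
    xs, ys = np.sort(x.ravel()), np.sort(y.ravel())
    return float(np.abs(xs - ys).mean())

def projected_distribution_loss(feats_x, feats_y):
    # feats_*: (C, H, W) feature tensors from some CNN layer.
    # Each channel is treated as one 1D "projection" of the features;
    # the loss averages the per-channel 1D-Wasserstein distances.
    return sum(w1_1d(cx, cy) for cx, cy in zip(feats_x, feats_y)) / len(feats_x)

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 8, 8))
print(projected_distribution_loss(a, a))        # identical features -> 0.0
print(projected_distribution_loss(a, a + 1.0))  # constant shift -> 1.0
```

Because sorted values of a constant-shifted sample are just the shifted sorted values, the loss of `a` against `a + 1.0` is exactly the shift magnitude, which makes the sketch easy to sanity-check.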

Strong Results and Implications

The efficacy of the PDL is validated through several benchmarks, where it shows superior performance in terms of perceptual quality compared to L1/L2 norms and contextual losses. For example, in scenarios with high noise levels, denoising models trained with PDL exhibit better LPIPS scores, indicating enhanced perceptual alignment with ground truth images.

Additionally, user studies in the evaluation confirm that human observers prefer PDL-enhanced outputs. Such findings underscore the potential of PDL in consumer-facing imaging applications where perceptual quality is paramount, such as smartphone photography and video restoration.

Theoretical Implications and Future Directions

The introduction of optimal transport theory into the imaging loss function domain highlights a broader trend of cross-disciplinary methodologies enhancing deep learning models. The PDL's application of the Wasserstein distance exemplifies how optimal transport can serve as a powerful tool in managing the geometry of feature spaces.

Future work could focus on further optimizing feature projection strategies and examining alternative CNN architectures to harness more intricate geometric insights from the feature space. A promising direction could involve the dynamic selection of feature layers based on task-specific perceptual metrics, honing the integration of PDL in real-time processing systems.

In conclusion, the Projected Distribution Loss represents a compelling advance in image restoration, offering a robust pathway to refining the perceptual quality of deep learning models through an elegant application of optimal transport. As imaging systems continue to evolve, such perceptual enhancements hold significant promise across multiple AI-related fields.
