Deep Neural Networks for Image Quality Assessment
The paper "Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment" by Bosse et al. introduces an innovative approach to Image Quality Assessment (IQA) utilizing deep convolutional neural networks (CNNs). By leveraging the capacity of deep learning, this research aims to solve both Full-Reference (FR) and No-Reference (NR) IQA problems without relying on hand-crafted features or domain knowledge traditionally employed in IQA research.
Key Contributions
The primary contributions of this work are twofold:
1. Dual-Mode IQA Network: The proposed network architecture can operate in both FR and NR modes with minimal adjustments.
2. Unified Framework for Local Quality and Weight Learning: The approach allows for end-to-end learning of both local quality scores and the relative importance of these scores to the overall image quality estimate.
Network Architecture
The core architecture operates on small (32×32 pixel) image patches and is notably deep for an IQA model, comprising:
- 10 convolutional layers and 5 pooling layers for feature extraction: This deep network ensures that complex and hierarchical image features can be captured.
- 2 fully connected layers for regression: These layers translate the extracted features into quality scores.
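For concreteness, here is a minimal PyTorch sketch of such a patch-level network. It is an illustrative reconstruction, not the authors' code: the 10 conv / 5 pool / 2 FC counts follow the description above, while the VGG-style channel widths (32 doubling to 512) and the 32×32 patch input reflect the design reported in the paper.

```python
import torch
import torch.nn as nn

class PatchFeatureExtractor(nn.Module):
    """10 conv + 5 max-pool layers mapping a 32x32 RGB patch to a 512-d feature vector."""
    def __init__(self):
        super().__init__()
        layers, in_ch = [], 3
        for out_ch in (32, 64, 128, 256, 512):  # VGG-style doubling widths
            layers += [
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
                nn.MaxPool2d(2),  # halves spatial size: 32 -> 16 -> 8 -> 4 -> 2 -> 1
            ]
            in_ch = out_ch
        self.features = nn.Sequential(*layers)

    def forward(self, x):                        # x: (N, 3, 32, 32)
        return self.features(x).flatten(1)       # -> (N, 512)

class PatchQualityRegressor(nn.Module):
    """Two fully connected layers regressing patch features to a scalar quality score."""
    def __init__(self, in_dim=512):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(in_dim, 512),
                                nn.ReLU(inplace=True),
                                nn.Linear(512, 1))

    def forward(self, f):
        return self.fc(f)                        # (N, 1) per-patch quality
```

After five 2×2 poolings a 32×32 patch collapses to a single spatial position, so the regression head sees one feature vector per patch rather than a spatial map.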
Feature Fusion Strategies
To facilitate FR IQA, the authors introduce a Siamese network structure where features from reference and distorted images are fused. Three fusion strategies were explored:
- Difference (f_d - f_r)
- Concatenation ([f_r, f_d])
- Concatenation plus difference ([f_r, f_d, f_d - f_r])
The empirical results suggest that the third approach provides the best performance, likely because it allows the model to leverage both raw and difference features.
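The fusion step itself is a one-liner per strategy. A hedged sketch follows, where `f_r` and `f_d` are per-patch feature vectors from the shared Siamese extractor and the function name is illustrative:

```python
import torch

def fuse_features(f_r: torch.Tensor, f_d: torch.Tensor,
                  strategy: str = "concat_diff") -> torch.Tensor:
    """Fuse reference and distorted patch features, each of shape (N, D)."""
    if strategy == "diff":          # difference only
        return f_d - f_r
    if strategy == "concat":        # concatenation only
        return torch.cat([f_r, f_d], dim=1)
    if strategy == "concat_diff":   # concatenation plus difference (best-performing)
        return torch.cat([f_r, f_d, f_d - f_r], dim=1)
    raise ValueError(f"unknown fusion strategy: {strategy}")
```

Note that the regression head's input dimensionality must match the chosen strategy (D, 2D, or 3D respectively).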
Spatial Pooling and Weighted Aggregation
Two spatial pooling methods are described:
- Simple Averaging
- Weighted Average Patch Aggregation: a parallel branch in the network learns a weight for each image patch, so patches are aggregated according to their estimated perceptual importance, yielding a more perceptually accurate global quality estimate (see the sketch after this list).
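A minimal sketch of the weighted aggregation, assuming per-patch scores `y` and a parallel fully connected weight branch (class and variable names are my own). Positivity of the weights is enforced with a ReLU plus a small epsilon, matching the normalized weighted average the paper describes:

```python
import torch
import torch.nn as nn

class WeightedPatchPooling(nn.Module):
    """Aggregate per-patch quality scores using learned, normalized patch weights."""
    def __init__(self, in_dim=512, eps=1e-6):
        super().__init__()
        # Parallel head on the same patch features, producing one raw weight per patch.
        self.weight_head = nn.Sequential(nn.Linear(in_dim, 512),
                                         nn.ReLU(inplace=True),
                                         nn.Linear(512, 1))
        self.eps = eps

    def forward(self, f, y):
        # f: (N, in_dim) patch features; y: (N, 1) per-patch quality scores
        a = torch.relu(self.weight_head(f)) + self.eps   # non-negative weights
        return (a * y).sum() / a.sum()                   # global quality estimate
```

Because the weights are normalized by their sum, the plain average is recovered as a special case when all weights are equal.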
Evaluation and Results
The proposed methods were rigorously evaluated on several benchmark datasets including LIVE, TID2013, CSIQ, and CLIVE. The experimental results demonstrated that:
- WaDIQaM-FR outperformed existing state-of-the-art FR IQA methods on both LIVE and TID2013.
- DIQaM-NR outperformed other NR IQA methods on the LIVE and TID2013 datasets, though not consistently on CLIVE.
- The patch-wise aggregation, especially when incorporating learned weights, significantly improved the quality prediction in settings with high spatial variability of distortions.
Implications and Future Work
The practical implications of this research are considerable:
- Enhancing Multimedia Applications: Improved IQA methods can be adopted in video streaming, image compression, and transmission systems for better quality control.
- Saliency Models: The interaction between the learned weights and image saliency suggests potential refinements where saliency maps could be co-learned within the same framework.
- Domain Adaptation: By leveraging models pre-trained on auxiliary tasks, such as mimicking traditional image quality measures (IQMs), the dependency on large labeled datasets can be reduced.
Generalization and Robustness
One of the notable strengths of the proposed methods is their ability to generalize across different datasets, as evidenced by cross-database evaluations. Although WaDIQaM-FR and DIQaM-NR performed robustly, the paper underscores the necessity for larger and more diverse training datasets to further enhance these generalization capabilities.
Bridging the FR-NR Gap
Uniquely, this paper demonstrates that, starting from an FR model, the approach can be systematically reduced to NR IQA by withholding information from the reference image. This exploration provides a step toward unified IQA models that can seamlessly transition between FR, RR (Reduced-Reference), and NR modes.
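As a hypothetical illustration of that transition, reusing the components sketched earlier (the function name and wiring are my own, not the paper's code):

```python
import torch

def forward_quality(extract, regress, pool, distorted, reference=None):
    """FR mode when reference patches are given; NR mode otherwise (illustrative only)."""
    f_d = extract(distorted)
    if reference is not None:                     # FR: fuse reference and distorted features
        f_r = extract(reference)                  # shared (Siamese) extractor weights
        feats = torch.cat([f_r, f_d, f_d - f_r], dim=1)
    else:                                         # NR: distorted features alone
        feats = f_d
    y = regress(feats)                            # per-patch quality scores
    return pool(f_d, y)                           # weighted aggregation to a global score
```

The regression head's input dimensionality differs between the two modes, which is why the paper's FR-to-NR adjustment is small but not zero.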
Conclusion
The research by Bosse et al. marks substantial progress in deep learning-based IQA, with significant improvements in performance and robustness over conventional methods. By shifting toward purely data-driven models and tailoring network architectures to specific IQA sub-tasks, it points to a promising direction for the field. Given these findings, further work on optimized network designs, larger and more diverse training datasets, and co-learned perceptual saliency maps could extend these gains.