Blind Image Quality Assessment Using A Deep Bilinear Convolutional Neural Network (1907.02665v1)

Published 5 Jul 2019 in eess.IV, cs.CV, and cs.MM

Abstract: We propose a deep bilinear model for blind image quality assessment (BIQA) that handles both synthetic and authentic distortions. Our model consists of two convolutional neural networks (CNN), each of which specializes in one distortion scenario. For synthetic distortions, we pre-train a CNN to classify image distortion type and level, where we enjoy large-scale training data. For authentic distortions, we adopt a pre-trained CNN for image classification. The features from the two CNNs are pooled bilinearly into a unified representation for final quality prediction. We then fine-tune the entire model on target subject-rated databases using a variant of stochastic gradient descent. Extensive experiments demonstrate that the proposed model achieves superior performance on both synthetic and authentic databases. Furthermore, we verify the generalizability of our method on the Waterloo Exploration Database using the group maximum differentiation competition.

Blind Image Quality Assessment Using a Deep Bilinear Convolutional Neural Network

The paper addresses blind image quality assessment (BIQA) by proposing a Deep Bilinear Convolutional Neural Network (DB-CNN) designed to handle both synthetic and authentic image distortions. The model uses two convolutional neural networks (CNNs), each specializing in one distortion scenario. For synthetic distortions, a CNN is pre-trained on a large-scale dataset to classify distortion type and level. For authentic distortions, which are more complex and varied, the model adopts a CNN pre-trained for image classification.

Methodology

The proposed DB-CNN model employs a two-stream architecture aimed at modeling synthetic and authentic distortions as two-factor variations:

Synthetic Distortions Handling:
- The synthetic distortion stream (S-CNN) is pre-trained on a dataset built from images in the Waterloo Exploration Database and the PASCAL VOC Database, comprising over 850,000 distorted images.
- The pre-training task is to classify distortion type and level. Because these labels come from the distortion generation process rather than from human ratings, large-scale training data is available, and the resulting weights serve as the initialization for later fine-tuning.

Authentic Distortions Handling:
- The authentic distortion stream uses a VGG-16 network pre-trained for image classification.
- Its features are intended to capture the authentic distortions encountered in real-world photographic images.

Bilinear Pooling:
- DB-CNN combines the features from the two streams using bilinear pooling.
- The resulting bilinear representation is a single unified feature vector used for the final quality prediction.
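
The pooling step described above can be illustrated with a minimal NumPy sketch. It assumes each stream outputs a feature map of shape (channels, height, width) on the same spatial grid, and it pools by summing the outer product of the two feature vectors over all spatial locations. The function name `bilinear_pool`, the toy channel counts, and the random inputs are hypothetical and chosen only for illustration; any normalization applied to the pooled vector and the regression layer that follows are left out.

```python
import numpy as np


def bilinear_pool(feat_a, feat_b):
    """Pool two feature maps into one vector via summed outer products.

    feat_a: array of shape (C1, H, W), e.g. from the synthetic stream.
    feat_b: array of shape (C2, H, W), e.g. from the authentic stream.
    Returns a vector of length C1 * C2.
    """
    c1, h, w = feat_a.shape
    c2, h_b, w_b = feat_b.shape
    if (h, w) != (h_b, w_b):
        raise ValueError("both feature maps must share the same spatial size")

    # Flatten the spatial grid so that each column is one location.
    a = feat_a.reshape(c1, h * w)
    b = feat_b.reshape(c2, h * w)

    # a @ b.T equals the sum over locations of outer(a[:, k], b[:, k]).
    pooled = a @ b.T  # shape (C1, C2)
    return pooled.reshape(-1)


# Toy inputs (assumed shapes, not the channel counts of the actual model).
rng = np.random.default_rng(0)
s_feat = rng.standard_normal((4, 3, 3))  # stand-in for S-CNN features
v_feat = rng.standard_normal((6, 3, 3))  # stand-in for VGG-16 features
z = bilinear_pool(s_feat, v_feat)
print(z.shape)
```

Because the output length is the product of the two channel counts, the pooled vector captures pairwise interactions between the two streams' features while discarding spatial position.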

The entire DB-CNN model is then fine-tuned on the target subject-rated databases using a variant of stochastic gradient descent.

Experimental Evaluation

DB-CNN's performance was evaluated on synthetic databases such as LIVE, CSIQ, and TID2013, as well as on the authentic LIVE Challenge Database. The results showcase:

- Robust Performance: The model is reported to achieve superior performance on both synthetic and authentic databases, measured by the Spearman rank-order correlation coefficient (SRCC) and the Pearson linear correlation coefficient (PLCC).
- Cross-Database Generalizability: It generalizes well, outperforming the competing models in the comparison even when trained on one database and tested on another.
- Resilience in Distortion Handling: DB-CNN handles both singly and multiply distorted images effectively.
- Waterloo Exploration Database Testing: Evaluated with the D-Test, L-Test, and P-Test on the Waterloo Exploration Database, DB-CNN showed competitive discriminability and ranking consistency. The abstract also reports a group maximum differentiation (gMAD) competition on this database to verify generalizability.
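
For readers unfamiliar with the two metrics, the following is a minimal sketch of how SRCC and PLCC can be computed between predicted scores and subjective ratings. The function names and the toy numbers are hypothetical and are not results from the paper. IQA studies often fit a nonlinear mapping to the predictions before computing PLCC; that step is omitted here.

```python
import numpy as np


def plcc(x, y):
    """Pearson linear correlation coefficient between two score lists."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def average_ranks(x):
    """Ranks starting at 1, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    return ranks


def srcc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(average_ranks(x), average_ranks(y))


# Toy example: predictions that order the images correctly but nonlinearly.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
subjective = [1.0, 4.0, 9.0, 16.0, 25.0]
rank_corr = srcc(predicted, subjective)
linear_corr = plcc(predicted, subjective)
print(rank_corr, linear_corr)
```

In this toy case the ranking is perfect, so SRCC is 1, while PLCC is slightly lower because the relationship is monotonic but not linear. SRCC therefore reflects prediction monotonicity and PLCC reflects linear agreement.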

Implications and Future Directions

DB-CNN shows that BIQA can benefit from pre-training on large synthetic datasets and from pooling features of two specialized networks. Fusing the two streams with bilinear pooling lets a single model address both synthetic and authentic distortions.

Future work could entail:

- Expanding distortion types and datasets for pre-training to enhance generalization.
- Exploring more advanced architectures like ResNet for potential performance gains.
- Refining the model to unify the training for both synthetic and authentic distortions.

DB-CNN offers a BIQA framework that handles synthetic and authentic distortions within one model.

Authors

Weixia Zhang
Kede Ma
Jia Yan
Dexiang Deng
Zhou Wang

Citations (573)
