Transformer-based No-Reference Image Quality Assessment via Supervised Contrastive Learning (2312.06995v1)
Abstract: Image Quality Assessment (IQA), and No-Reference Image Quality Assessment (NR-IQA) in particular, has long been an active research topic in image processing. Owing to their powerful feature extraction ability, existing Convolutional Neural Network (CNN) and Transformer-based NR-IQA methods have made considerable progress. However, they still show limited capability on unseen datasets of authentic distortions. To further improve NR-IQA performance, this paper proposes SaTQA, a novel NR-IQA model based on supervised contrastive learning (SCL) and Transformers. We first train a model on a large-scale synthetic dataset with SCL (no subjective image scores are required) to extract degradation features from images with various distortion types and levels. To further extract distortion information from images, we propose a backbone network incorporating the Multi-Stream Block (MSB), which combines the inductive bias of CNNs with the long-range dependency modeling capability of Transformers. Finally, we propose the Patch Attention Block (PAB), which produces the final quality score of the distorted image by fusing the degradation features learned through contrastive learning with the perceptual distortion information extracted by the backbone network. Experimental results on seven standard IQA datasets show that SaTQA outperforms state-of-the-art methods on both synthetic and authentic datasets. Code is available at https://github.com/I2-Multimedia-Lab/SaTQA
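The SCL pre-training step described in the abstract can be illustrated with a minimal NumPy sketch of the general supervised contrastive loss (Khosla et al., 2020). This is not the SaTQA implementation: the function name `supcon_loss`, the temperature value, and the way class labels are built from distortion type and level (`distortion_type * num_levels + distortion_level`) are assumptions made for illustration only. The point it shows is that the labels come from the synthetic distortion settings, so no subjective quality score is needed.

```python
import numpy as np

def supcon_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss over a batch of at least two samples.

    features: (N, D) embeddings; labels: (N,) integer class ids.
    Samples sharing a label are treated as positives for each other.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    # L2-normalise so the dot product is a cosine similarity.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    logits = z @ z.T / temperature
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    positives = (labels[:, None] == labels[None, :]) & not_self
    # Log-softmax over all other samples, excluding the anchor itself.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp_logits = np.exp(logits) * not_self
    log_prob = logits - np.log(exp_logits.sum(axis=1, keepdims=True))
    # Average log-probability of the positives, for anchors that have any.
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0
    mean_log_prob_pos = (log_prob * positives).sum(axis=1)[valid] / pos_count[valid]
    return float(-mean_log_prob_pos.mean())

# Hypothetical labelling: one class per (distortion type, level) pair.
num_levels = 5
distortion_type = np.array([0, 0, 1, 1, 2, 2])
distortion_level = np.array([0, 0, 3, 3, 1, 1])
labels = distortion_type * num_levels + distortion_level

# Toy embeddings in which images with the same degradation coincide.
clustered = np.repeat(np.eye(3), 2, axis=0)
loss_matched = supcon_loss(clustered, labels)
# The same embeddings scored against labels that disagree with the clusters.
loss_mismatched = supcon_loss(clustered, np.array([0, 1, 2, 0, 1, 2]))
print(loss_matched, loss_mismatched)
```

The loss is small when embeddings of images with the same distortion type and level lie close together and grows when positives are scattered among negatives, which is the property an encoder of degradation features is trained to exploit.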
- Jinsong Shi
- Pan Gao
- Jie Qin