You Only Train Once: A Unified Framework for Both Full-Reference and No-Reference Image Quality Assessment (2310.09560v2)
Abstract: Although recent efforts in image quality assessment (IQA) have achieved promising performance, a considerable gap remains compared to the human visual system (HVS). One significant disparity is that humans switch seamlessly between full-reference (FR) and no-reference (NR) tasks, whereas existing models are constrained to either FR or NR tasks. This constraint means that two distinct systems must be designed, which greatly diminishes a model's versatility. Our focus is therefore on unifying FR and NR IQA under a single framework. Specifically, we first employ an encoder to extract multi-level features from input images. A Hierarchical Attention (HA) module is then proposed as a universal adapter for both FR and NR inputs to model the spatial distortion at each encoder stage. Furthermore, because different distortions contaminate encoder stages and damage image semantic meaning differently, a Semantic Distortion Aware (SDA) module is proposed to examine feature correlations between shallow and deep layers of the encoder. By adopting HA and SDA, the proposed network can effectively perform both FR and NR IQA. When trained independently on NR or FR IQA tasks, our model outperforms existing models and achieves state-of-the-art performance. Moreover, when trained jointly on NR and FR IQA tasks, it further improves NR IQA performance while remaining on par with the state of the art in FR IQA. You only train once to perform both IQA tasks. Code will be released at: https://github.com/BarCodeReader/YOTO.
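To make the idea of a "universal adapter" for FR and NR inputs concrete, here is a minimal sketch in plain Python. It is not the authors' HA module: the abstract does not specify its internals, so this sketch assumes (as an illustration only) that a single attention routine can serve both tasks by attending over reference features when a reference exists and over the distorted features themselves when it does not. The names `attention` and `unified_adapter` and the toy feature vectors are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of feature vectors."""
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        # Similarity of this query to every key.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(len(values[0]))])
    return out


def unified_adapter(distorted, reference=None):
    """Hypothetical adapter (an assumption, not the paper's HA module).

    FR mode: distorted features attend over the reference features.
    NR mode: no reference is given, so they attend over themselves.
    """
    context = distorted if reference is None else reference
    return attention(distorted, context, context)


# Toy features: three distorted tokens, two reference tokens, dimension 2.
distorted = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
reference = [[1.0, 0.0], [0.0, 1.0]]

nr_out = unified_adapter(distorted)             # NR: self-attention
fr_out = unified_adapter(distorted, reference)  # FR: cross-attention
```

Under this assumption the same function handles both input types, and the output always has one vector per distorted token, so whatever follows the adapter does not need to know which task it is serving.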