
Scaling and Masking: A New Paradigm of Data Sampling for Image and Video Quality Assessment (2401.02614v1)

Published 5 Jan 2024 in cs.CV and cs.MM

Abstract: Quality assessment of images and videos depends on both local details and global semantics, whereas general data sampling methods (e.g., resizing, cropping, or grid-based fragments) fail to capture both simultaneously. To address this deficiency, current approaches adopt multi-branch models that take multi-resolution data as input, which increases model complexity. In this work, instead of stacking up models, we explore a more elegant data sampling method (named SAMA, for scaling and masking) that packs both local and global content into a regular input size. The basic idea is to first scale the data into a pyramid and then reduce the pyramid to a regular data dimension with a masking strategy. Because images and videos are spatially and temporally redundant, the processed data retains its multi-scale characteristics at a regular input size and can therefore be processed by a single-branch model. We verify the sampling method on image and video quality assessment. Experiments show that our sampling method significantly improves the performance of current single-branch models and achieves performance competitive with multi-branch models without extra model complexity. The source code will be available at https://github.com/Sissuire/SAMA.
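
The scale-then-mask idea described in the abstract can be illustrated with a minimal sketch for a single image. This is not the authors' implementation: the abstract does not specify how the pyramid is built or how the mask is chosen, so the function names (`resize_nearest`, `scale_and_mask`), the doubling pyramid sizes, the nearest-neighbour resizing, and the uniformly random per-cell choice of level are all assumptions made for illustration.

```python
import numpy as np


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an (H, W, C) array (a stand-in for a proper resampler)."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows[:, None], cols[None, :]]


def scale_and_mask(img, grid=7, patch=32, num_scales=3, rng=None):
    """Pack a multi-scale pyramid into one (grid*patch, grid*patch) input.

    Assumed scheme (hypothetical, for illustration only):
      - level k of the pyramid has side length grid * patch * 2**k;
      - every level is split into the same grid x grid layout of cells;
      - for each cell, the mask keeps a patch from exactly one level.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = grid * patch

    # Scaling: level 0 is the coarse global view, higher levels keep finer detail.
    pyramid = [resize_nearest(img, out * 2**k, out * 2**k) for k in range(num_scales)]

    # Masking: one surviving level per grid cell, all other levels are dropped there.
    keep = rng.integers(0, num_scales, size=(grid, grid))

    result = np.empty((out, out) + img.shape[2:], dtype=img.dtype)
    for i in range(grid):
        for j in range(grid):
            k = int(keep[i, j])
            cell = patch * 2**k  # side length of this cell at level k
            # Random patch-sized crop inside the cell (offset is 0 at level 0).
            dy, dx = rng.integers(0, cell - patch + 1, size=2)
            y0, x0 = i * cell + dy, j * cell + dx
            result[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = \
                pyramid[k][y0:y0 + patch, x0:x0 + patch]
    return result, keep
```

Under these assumptions, calling `scale_and_mask` on an image of any resolution with the default arguments returns a 224×224 array whose cells mix coarse and fine content, so a single-branch model with a fixed input size could consume it directly. The random per-cell selection is only one possible masking strategy; a fixed or structured pattern would fit the same description.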
