AIS 2024 Challenge on Video Quality Assessment of User-Generated Content: Methods and Results (2404.16205v1)

Published 24 Apr 2024 in cs.CV and cs.MM

Abstract: This paper reviews the AIS 2024 Video Quality Assessment (VQA) Challenge, focused on User-Generated Content (UGC). The aim of this challenge is to gather deep learning-based methods capable of estimating the perceptual quality of UGC videos. The user-generated videos from the YouTube UGC Dataset span diverse content (sports, games, lyrics, anime, etc.), quality levels, and resolutions. The proposed methods must process 30 FHD frames in under 1 second. In the challenge, a total of 102 participants registered, and 15 submitted code and models. The performance of the top-5 submissions is reviewed and provided here as a survey of diverse deep models for efficient video quality assessment of user-generated content.
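
The 1-second budget for 30 FHD (1920x1080) frames is the central efficiency constraint of the challenge. As an illustration only, the sketch below shows how such a runtime budget could be checked for a PyTorch-based VQA model; the meets_runtime_budget helper, the model interface, and the (batch, frames, channels, height, width) tensor layout are assumptions for illustration, not the challenge's official evaluation harness.

import time
import torch

def meets_runtime_budget(model, device="cuda", budget_s=1.0, warmup=3, runs=10):
    """Check whether `model` scores a 30-frame FHD clip within `budget_s` seconds.

    Assumes a no-reference VQA model taking a (batch, frames, C, H, W) tensor;
    real methods typically downsample or sample fragments before inference.
    """
    model = model.eval().to(device)
    # Dummy clip: 30 FHD (1920x1080) RGB frames, matching the challenge setting.
    clip = torch.rand(1, 30, 3, 1080, 1920, device=device)

    with torch.no_grad():
        for _ in range(warmup):            # warm-up runs to exclude one-time overhead
            model(clip)
        if device == "cuda":
            torch.cuda.synchronize()       # ensure queued GPU work has finished
        start = time.perf_counter()
        for _ in range(runs):
            model(clip)
        if device == "cuda":
            torch.cuda.synchronize()
    avg = (time.perf_counter() - start) / runs
    print(f"average inference time for 30 FHD frames: {avg:.3f} s")
    return avg < budget_s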
