AIS 2024 Challenge on Video Quality Assessment of User-Generated Content: Methods and Results (2404.16205v1)
Published 24 Apr 2024 in cs.CV and cs.MM
Abstract: This paper reviews the AIS 2024 Video Quality Assessment (VQA) Challenge, focused on User-Generated Content (UGC). The aim of this challenge is to gather deep learning-based methods capable of estimating the perceptual quality of UGC videos. The user-generated videos from the YouTube UGC Dataset span diverse content (sports, games, lyrics, anime, etc.), quality levels, and resolutions. The proposed methods must process 30 FHD frames in under 1 second. A total of 102 participants registered for the challenge, and 15 submitted code and models. The performance of the top-5 submissions is reviewed and provided here as a survey of diverse deep models for efficient video quality assessment of user-generated content.
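The efficiency constraint stated in the abstract (30 FHD frames in under 1 second) can be checked with a simple timing harness. The sketch below is only an illustration, not the challenge's official benchmark: `dummy_quality_model`, `meets_budget`, the 1920x1080 RGB frame layout, and the best-of-three timing policy are all assumptions introduced here.

```python
import time

FRAMES = 30                  # frame count stated in the abstract
WIDTH, HEIGHT = 1920, 1080   # assumed FHD resolution
BUDGET_SECONDS = 1.0         # time limit stated in the abstract


def dummy_quality_model(frames):
    """Hypothetical stand-in for a VQA model: averages the first byte of each frame."""
    return sum(frame[0] for frame in frames) / len(frames)


def meets_budget(model, frames, budget=BUDGET_SECONDS, repeats=3):
    """Time `model(frames)` `repeats` times; return (fits_budget, best_seconds)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        model(frames)
        best = min(best, time.perf_counter() - start)
    return best < budget, best


# One zero-filled RGB frame reused 30 times to keep memory use small.
frame = bytes(WIDTH * HEIGHT * 3)
frames = [frame] * FRAMES

ok, seconds = meets_budget(dummy_quality_model, frames)
print(f"within budget: {ok} ({seconds:.4f}s for {len(frames)} frames)")
```

A real measurement would replace `dummy_quality_model` with the actual network, include any preprocessing in the timed region, and run on the target hardware.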