- The paper introduces distinct technical and aesthetic perspectives to comprehensively assess user-generated video quality.
- It constructs the large-scale DIVIDE-3k dataset, 3,590 videos with 450,000 human opinions, grounding the analysis in robust subjective evidence.
- The novel models DOVER and DOVER++ effectively disentangle aesthetic and technical quality factors, outperforming state-of-the-art methods on key UGC-VQA benchmarks.
Insights into Video Quality Assessment for User-Generated Content
The paper "Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives" seeks to explore the complex domain of Video Quality Assessment (VQA) specifically targeted at User-Generated Content (UGC). This paper arises from the increasing prevalence of UGC videos and the need for effective algorithms to assess their quality. The objective of this research is unique as it distinguishes video quality from two main perspectives: technical and aesthetic.
Key Contributions
- Distinct Problem Perspectives: The research identifies two complementary perspectives in UGC-VQA. The technical perspective measures distortions such as blur and compression artifacts, which arise from varied capture devices and compression pipelines. The aesthetic perspective concerns content and composition preferences, which are driven largely by semantics rather than by low-level signal fidelity (a minimal two-branch sketch appears after this list).
- The DIVIDE-3k Database: A major contribution is the construction of the DIVIDE-3k dataset, built on a large-scale subjective study that records human perception of video quality from both perspectives. It includes 3,590 diverse UGC videos with 450,000 human opinions, providing a robust foundation for studying how aesthetic and technical factors each shape video quality assessment.
- Development of DOVER and DOVER++: The paper introduces two novel VQA models, the Disentangled Objective Video Quality Evaluator (DOVER) and its enhanced version, DOVER++. DOVER learns video quality by evaluating the two perspectives in separate branches and fusing their scores, achieving new state-of-the-art performance. DOVER++, trained on DIVIDE-3k, can additionally score each perspective on its own, making it possible to attribute a quality issue to an aesthetic or a technical origin.
- Experimental Evaluations: Experiments demonstrate that the proposed methods outperform existing state-of-the-art approaches on several UGC-VQA datasets, including LSVQ, KoNViD-1k, and YouTube-UGC, reported with the standard correlation metrics shown in the snippet after this list. Notably, the results underline that both aesthetic and technical perspectives are needed for a comprehensive assessment of UGC videos.
- Advanced Supervision Strategy: The paper devises a limited-view biased supervision strategy in which overall quality opinions guide each branch of the evaluator, addressing the difficulty of obtaining direct labels for each perspective (a hedged loss sketch also follows this list).
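To make the two-perspective design concrete, here is a minimal PyTorch sketch of a two-branch evaluator in the spirit of DOVER. The fragment-sampling routine, the 224×224 aesthetic resolution, the `feat_dim` size, and the equal-weight fusion are illustrative assumptions; the paper's actual backbones and view-sampling details differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def sample_fragments(video: torch.Tensor, grid: int = 7, patch: int = 32) -> torch.Tensor:
    """Simplified fragment sampling: one small native-resolution patch per
    grid cell, stitched back into a mosaic. Local textures (and hence
    technical distortions) survive; global composition does not.
    Assumes H and W are large enough that every cell fits a patch."""
    B, C, T, H, W = video.shape
    cell_h, cell_w = H // grid, W // grid
    rows = []
    for i in range(grid):
        cols = []
        for j in range(grid):
            y = i * cell_h + int(torch.randint(0, max(cell_h - patch, 1), (1,)))
            x = j * cell_w + int(torch.randint(0, max(cell_w - patch, 1), (1,)))
            cols.append(video[..., y:y + patch, x:x + patch])
        rows.append(torch.cat(cols, dim=-1))
    return torch.cat(rows, dim=-2)  # (B, C, T, grid*patch, grid*patch)

class TwoBranchVQA(nn.Module):
    """Illustrative two-branch evaluator. Both backbones are assumed to map
    a video tensor to a (B, feat_dim) feature vector."""

    def __init__(self, backbone_aes: nn.Module, backbone_tech: nn.Module, feat_dim: int = 768):
        super().__init__()
        self.backbone_aes = backbone_aes
        self.backbone_tech = backbone_tech
        self.head_aes = nn.Linear(feat_dim, 1)
        self.head_tech = nn.Linear(feat_dim, 1)

    def forward(self, video: torch.Tensor):
        # video: (B, C, T, H, W)
        T = video.shape[2]
        # Aesthetic view: strong spatial downsampling keeps composition
        # and semantics while washing out fine-grained distortions.
        aes_view = F.interpolate(video, size=(T, 224, 224),
                                 mode="trilinear", align_corners=False)
        # Technical view: native-scale fragments keep distortions visible.
        tech_view = sample_fragments(video)
        q_aes = self.head_aes(self.backbone_aes(aes_view)).squeeze(-1)
        q_tech = self.head_tech(self.backbone_tech(tech_view)).squeeze(-1)
        # Equal-weight fusion is an assumption, not the paper's exact scheme.
        q_overall = 0.5 * (q_aes + q_tech)
        return q_aes, q_tech, q_overall
```

The point of the two views is that each one deliberately destroys the other perspective's signal: downsampling erases fine distortions while fragments scramble composition, which is what lets each branch specialize.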
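The limited-view biased supervision can likewise be sketched as a training objective: the overall mean opinion score (MOS) serves as a biased but informative target for each single-view branch as well as for the fused prediction. The PLCC-style loss and the equal weights below are assumptions chosen for illustration, not the paper's exact objective.

```python
import torch

def plcc_loss(pred: torch.Tensor, mos: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Differentiable (1 - PLCC) / 2: pushes predictions to correlate
    linearly with mean opinion scores within a batch."""
    p = pred - pred.mean()
    m = mos - mos.mean()
    plcc = (p * m).mean() / (p.std(unbiased=False) * m.std(unbiased=False) + eps)
    return (1 - plcc) / 2

def limited_view_biased_loss(q_aes, q_tech, q_overall, mos, weights=(1.0, 1.0, 1.0)):
    """Sketch of limited-view biased supervision: the overall MOS supervises
    each single-view branch alongside the fused prediction (weights assumed)."""
    w_a, w_t, w_o = weights
    return (w_a * plcc_loss(q_aes, mos)
            + w_t * plcc_loss(q_tech, mos)
            + w_o * plcc_loss(q_overall, mos))
```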
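Finally, results on UGC-VQA benchmarks such as LSVQ, KoNViD-1k, and YouTube-UGC are conventionally reported as Spearman (SROCC) and Pearson (PLCC) correlations between predicted scores and MOS. A minimal helper using scipy, with hypothetical scores in the example:

```python
from scipy.stats import pearsonr, spearmanr

def vqa_correlations(pred_scores, mos_labels):
    """SROCC measures rank agreement (monotonicity); PLCC measures
    linear agreement with the subjective scores."""
    srocc = spearmanr(pred_scores, mos_labels)[0]
    plcc = pearsonr(pred_scores, mos_labels)[0]
    return srocc, plcc

# Hypothetical predictions vs. mean opinion scores:
srocc, plcc = vqa_correlations([0.7, 0.2, 0.9, 0.4], [3.8, 2.1, 4.5, 2.9])
print(f"SROCC={srocc:.3f}, PLCC={plcc:.3f}")
```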
Implications and Future Directions
The findings of this paper point toward a richer understanding of how viewers perceive video quality, emphasizing the distinct roles of technical and aesthetic factors. This delineation can inform algorithm design in domains such as content recommendation and digital rights management, where understanding viewer perception in terms of both clarity and content preference is crucial. Furthermore, the methodology can extend to other modalities that call for disentangled perceptual evaluation, encouraging further research into nuanced quality assessment across different media.
The paper also points to user-centric applications, suggesting that these evaluation models could be adapted for personalized content delivery that accounts for individual preferences regarding aesthetic or technical quality. Future work could explore how the two perspectives interact dynamically and how personalized feedback loops might further refine automated quality assessment. Moreover, the combination of these perspectives and the resulting evaluation models provides a fresh direction for aligning machine learning frameworks more closely with human judgments of video quality.