YouTube UGC Dataset for Video Compression Research (1904.06457v2)

Published 13 Apr 2019 in cs.MM and eess.IV

Abstract: Non-professional video, commonly known as User Generated Content (UGC), has become very popular in today's video sharing applications. However, traditional metrics used in compression and quality assessment, like BD-Rate and PSNR, are designed for pristine originals. Thus, their accuracy drops significantly when applied to non-pristine originals (the majority of UGC). Understanding difficulties for compression and quality assessment in the scenario of UGC is important, but there are few public UGC datasets available for research. This paper introduces a large scale UGC dataset (1500 20-second video clips) sampled from millions of YouTube videos. The dataset covers popular categories like Gaming, Sports, and new features like High Dynamic Range (HDR). Besides a novel sampling method based on features extracted from encoding, challenges for UGC compression and quality evaluation are also discussed. Shortcomings of traditional reference-based metrics on UGC are addressed. We demonstrate a promising way to evaluate UGC quality by no-reference objective quality metrics, and evaluate the current dataset with three no-reference metrics (Noise, Banding, and SLEEQ).



Summary

The paper presents a novel, feature-guided sampling method to construct a balanced dataset of 1500 diverse UGC video clips for video compression research.
The paper demonstrates that traditional reference-based quality metrics like PSNR and SSIM are inadequate for UGC, advocating for the use of no-reference metrics such as SLEEQ.
The paper shows that the dataset supports detailed evaluation of visual artifacts across content categories, which can inform the design of video encoding algorithms.

Overview of "YouTube UGC Dataset for Video Compression Research"

The paper by Wang, Inguva, and Adsumilli introduces an extensive dataset of User Generated Content (UGC) from YouTube designed to address the challenges of video compression and quality assessment. The dataset comprises 1500 video clips spanning 15 categories, including Gaming, Sports, and High Dynamic Range (HDR), and is sampled from a vast pool of over 1.5 million videos. The authors emphasize the inadequacy of traditional reference-based quality metrics, such as PSNR and SSIM, in evaluating UGC, which often lacks pristine originals.

Dataset Composition and Sampling Methodology

The YouTube UGC dataset aims to provide a representative sample of the video content available on YouTube. Videos were selected to cover a wide range of resolutions and categories, reflecting user preferences and technology advancements. The authors employed a novel sampling method based on features extracted from video encoding logs, guided by four attributes: spatial complexity, color complexity, temporal complexity, and chunk variation. Sampling across this feature space, rather than uniformly over uploads, yields a balanced and representative dataset, which makes it more useful for academic research and practical video processing applications.
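The idea of feature-guided sampling can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each video already carries the four attributes normalized to [0, 1], and the names `FEATURES`, `bin_index`, `sample_by_feature_bins`, `bins_per_axis`, and `per_bin` are hypothetical. It divides the 4-D feature space into equal cells and draws a fixed number of videos from every occupied cell, so that rare content is not swamped by the most common kind.

```python
import random
from collections import defaultdict

# Hypothetical attribute names; values are assumed to be normalized to [0, 1].
FEATURES = ("spatial", "color", "temporal", "chunk_variation")


def bin_index(value, bins_per_axis):
    """Map a value in [0, 1] to an integer bin; 1.0 falls into the last bin."""
    return min(int(value * bins_per_axis), bins_per_axis - 1)


def sample_by_feature_bins(videos, bins_per_axis=4, per_bin=1, seed=0):
    """Pick up to `per_bin` videos from every occupied cell of the feature grid."""
    rng = random.Random(seed)
    cells = defaultdict(list)
    for video in videos:
        key = tuple(bin_index(video[name], bins_per_axis) for name in FEATURES)
        cells[key].append(video)

    selected = []
    for key in sorted(cells):  # deterministic order over occupied cells
        members = cells[key]
        selected.extend(rng.sample(members, min(per_bin, len(members))))
    return selected


if __name__ == "__main__":
    # A skewed pool: 50 near-identical simple videos and one complex outlier.
    pool = [
        {"id": i, "spatial": 0.1, "color": 0.1, "temporal": 0.1, "chunk_variation": 0.1}
        for i in range(50)
    ]
    pool.append(
        {"id": 50, "spatial": 0.9, "color": 0.9, "temporal": 0.9, "chunk_variation": 1.0}
    )
    for video in sample_by_feature_bins(pool):
        print(video["id"])
```

Uniform random sampling from this pool would almost always return one of the 50 simple videos; sampling per cell returns one simple video and the outlier.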

Challenges in UGC Video Compression and Quality Assessment

Traditional video compression and quality assessment research often assumes that original videos are pristine. However, UGC frequently contains visual artifacts and inconsistencies, which makes conventional metrics hard to apply. The paper highlights these challenges by demonstrating how typical reference-based metrics can produce misleading results when applied to UGC. The mismatch arises because these metrics treat the uploaded original as ground truth: any deviation from it counts as a loss, even when the original itself is degraded.
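A small numeric sketch makes the problem concrete. PSNR is defined as 10·log10(peak² / MSE) between a reference and a test signal. The toy data below is invented for illustration (a flat 8-pixel patch with synthetic noise), and the two "encoder outputs" are hypothetical: one reproduces the upload's noise, the other removes it.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length pixel sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the signals are identical."""
    err = mse(reference, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


if __name__ == "__main__":
    clean = [100] * 8                                     # what the scene looked like
    upload = [100 + (8 if i % 2 == 0 else -8) for i in range(8)]  # noisy UGC original
    keeps_noise = [u + 1 for u in upload]                 # output that preserves the noise
    denoised = list(clean)                                # output that removes the noise

    # Scored against the noisy upload, the denoised output looks worse ...
    print(psnr(upload, keeps_noise), psnr(upload, denoised))
    # ... although it matches the clean scene exactly.
    print(psnr(clean, denoised))
```

Measured against the upload, the output that faithfully reproduces the noise scores higher than the one that matches the clean scene, which is the kind of misleading ranking the paper warns about.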

The authors propose employing no-reference quality metrics, including Noise, Banding, and the Self-reference-based Learning-free Evaluator of Quality (SLEEQ), to assess UGC more reliably. These metrics address specific artifact-related issues and better correlate with human perception in non-pristine scenarios.
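To illustrate what "no-reference" means in practice, the sketch below estimates noise strength from a single frame without any original to compare against. It is not the Noise metric used in the paper; it swaps in a well-known stand-in, Immerkaer's fast noise estimator, which filters the image with a Laplacian-difference mask that cancels smooth content and scales the mean absolute response into a standard deviation. The function name and the synthetic test image are assumptions made for this example.

```python
import math
import random

# Laplacian-difference mask; its coefficients sum to zero, so flat and
# smoothly varying regions produce (near-)zero response.
MASK = ((1, -2, 1), (-2, 4, -2), (1, -2, 1))


def estimate_noise_sigma(image):
    """Estimate the noise standard deviation of a 2-D grayscale image (list of rows)."""
    height, width = len(image), len(image[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3")
    total = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            response = sum(
                MASK[dy][dx] * image[y + dy - 1][x + dx - 1]
                for dy in range(3)
                for dx in range(3)
            )
            total += abs(response)
    # For Gaussian noise the mask response has std 6*sigma, and E|r| = std * sqrt(2/pi).
    return math.sqrt(math.pi / 2.0) * total / (6.0 * (width - 2) * (height - 2))


if __name__ == "__main__":
    rng = random.Random(0)
    flat = [[128] * 64 for _ in range(64)]
    noisy = [[128 + rng.gauss(0, 5) for _ in range(64)] for _ in range(64)]
    print(estimate_noise_sigma(flat))   # flat frame: no noise detected
    print(estimate_noise_sigma(noisy))  # should land near the injected sigma of 5
```

Because the estimate depends only on the frame itself, it can be computed for any upload, which is the property that makes no-reference metrics usable on UGC.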

Evaluation and Insights

Evaluation of the YouTube UGC dataset using these no-reference metrics reveals that most UGC maintains a reasonable level of visual quality, despite the diversity and intrinsic distortions present in user uploads. The distribution of detected artifacts, such as banding and noise, varies across content categories, highlighting areas where further advancements in compression algorithms might be necessary.

Implications and Future Directions

The dataset supports research in video compression and quality assessment that reflects the realities of UGC. It underscores the importance of developing metrics and algorithms that account for the non-pristine nature of much online content. The work also raises the question of how encoders should treat artifacts already present in the input, for example whether to spend bits preserving them, which points to future research on the trade-off between video quality and compression efficiency.

By providing a curated, publicly available resource, the paper encourages further work on video compression and the adoption of evaluation methods suited to UGC. It is a step towards understanding and improving the systems that handle the large volume of video uploaded by internet users.
