COMISR: Compression-Informed Video Super-Resolution (2105.01237v2)

Published 4 May 2021 in cs.CV

Abstract: Most video super-resolution methods focus on restoring high-resolution video frames from low-resolution videos without taking into account compression. However, most videos on the web or mobile devices are compressed, and the compression can be severe when the bandwidth is limited. In this paper, we propose a new compression-informed video super-resolution model to restore high-resolution content without introducing artifacts caused by compression. The proposed model consists of three modules for video super-resolution: bi-directional recurrent warping, detail-preserving flow estimation, and Laplacian enhancement. All these three modules are used to deal with compression properties such as the location of the intra-frames in the input and smoothness in the output frames. For thorough performance evaluation, we conducted extensive experiments on standard datasets with a wide range of compression rates, covering many real video use cases. We showed that our method not only recovers high-resolution content on uncompressed frames from the widely-used benchmark datasets, but also achieves state-of-the-art performance in super-resolving compressed videos based on numerous quantitative metrics. We also evaluated the proposed method by simulating streaming from YouTube to demonstrate its effectiveness and robustness. The source codes and trained models are available at https://github.com/google-research/google-research/tree/master/comisr.

Summary

The paper's main contribution is COMISR, a new model that integrates three modules to effectively mitigate compression-induced distortions in video super-resolution.
It demonstrates that incorporating compression-aware techniques like bi-directional recurrent warping and Laplacian enhancement significantly improves PSNR and SSIM metrics.
The study underscores the benefit of mixed training with compressed and uncompressed frames, enhancing model generalization for real-world applications such as video streaming.

COMISR: Compression-Informed Video Super-Resolution

The paper "COMISR: Compression-Informed Video Super-Resolution" addresses the significant challenge of video super-resolution (VSR) for compressed video streams, commonly found on the web and mobile devices. Existing super-resolution (SR) methods typically focus on high-resolution restoration from low-resolution frames without accounting for compression artifacts. This research introduces a new model, COMpression-Informed video Super-Resolution (COMISR), designed to handle compressed video inputs effectively.

Core Contributions

The primary contribution of the paper is the COMISR model, which integrates three novel modules: bi-directional recurrent warping, detail-preserving flow estimation, and Laplacian enhancement. These components are tailored to mitigate compression-induced distortions while enhancing the VSR process:

Bi-Directional Recurrent Warping: This module compensates for the unknown location of intra-frames in compressed videos, thereby reducing error accumulation in temporal sequences.
Detail-Preserving Flow Estimation: By focusing on estimating both low-resolution and high-resolution optical flows, the model better retains intricate details that might otherwise be lost due to compression.
Laplacian Enhancement: This feature emphasizes high-frequency image content, counteracting the smoothing effect of video compression and preserving fine details.
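
As a rough illustration of the Laplacian enhancement idea, the sketch below adds back a weighted residual between a frame and its Gaussian-blurred copy, which boosts high-frequency content. The function names, the weight alpha, and the blur width sigma are illustrative assumptions and are not taken from the paper's released code.

```python
import numpy as np


def gaussian_kernel_1d(sigma: float, radius: int) -> np.ndarray:
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(image: np.ndarray, sigma: float = 1.5) -> np.ndarray:
    """Blur a 2-D grayscale image with a separable Gaussian filter."""
    radius = int(3 * sigma + 0.5)
    kernel = gaussian_kernel_1d(sigma, radius)
    # Reflect-pad so the output keeps the input's height and width.
    padded = np.pad(image.astype(np.float64), radius, mode="reflect")
    # Filter along rows, then along columns (separable convolution).
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def laplacian_enhance(
    image: np.ndarray, alpha: float = 1.0, sigma: float = 1.5
) -> np.ndarray:
    """Add the high-frequency residual (image minus its blur) back to the image.

    alpha and sigma are hypothetical defaults chosen for illustration only.
    """
    image = image.astype(np.float64)
    residual = image - gaussian_blur(image, sigma)
    return image + alpha * residual
```

A flat region is left unchanged because its residual is zero, while values on either side of an edge are pushed apart. The sketch does not clip the result, so a caller working with a fixed intensity range would need to do that separately.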

The approach departs from conventional VSR methods by building compression awareness into the network design itself. According to the authors, the model recovers high-resolution content on standard uncompressed benchmarks and achieves state-of-the-art results on compressed inputs across a range of compression levels, as measured by quantitative metrics such as PSNR and SSIM.
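
For readers unfamiliar with PSNR, the following minimal sketch computes it from the mean squared error between a reference frame and a restored frame. The function name and the assumption of 8-bit pixel values (peak of 255) are illustrative; the paper's exact evaluation protocol, such as which color channels are scored, is not described in this summary.

```python
import numpy as np


def psnr(reference, estimate, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in decibels between two same-shaped images."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Higher values indicate a restored frame that is closer to the reference. SSIM, the other metric mentioned, compares local structure rather than raw pixel error and is not sketched here.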

Experimental Evaluation

The authors conducted experiments on datasets such as Vid4 and REDS4 and report strong performance on compressed inputs with constant rate factor (CRF) values ranging from 15 to 35, where a higher CRF means heavier compression. The experiments also show that preprocessing inputs with denoising algorithms before applying VSR methods typically degrades performance. This highlights the need for an integrated, compression-informed approach such as the one implemented in COMISR.
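
To make the CRF setting concrete, the sketch below builds an ffmpeg command line that re-encodes a clip with the libx264 encoder at a chosen CRF. This assumes H.264 encoding through ffmpeg; the encoder settings actually used in the paper are not given in this summary, and the helper name and file paths are hypothetical.

```python
from typing import List


def build_crf_command(src: str, dst: str, crf: int) -> List[str]:
    """Return an ffmpeg argument list that encodes src to dst at the given CRF.

    Assumes the libx264 encoder, whose 8-bit CRF scale runs from 0 to 51.
    """
    if not 0 <= crf <= 51:
        raise ValueError(f"CRF must be in [0, 51], got {crf}")
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it exists
        "-i", src,       # input video
        "-c:v", "libx264",
        "-crf", str(crf),
        dst,
    ]


# Hypothetical file names; the commands are only built, not executed.
commands = [build_crf_command("clip.mp4", f"clip_crf{c}.mp4", c) for c in (15, 25, 35)]
```

The returned list can be passed to subprocess.run on a machine where ffmpeg is installed.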

A significant insight from the research is the suggestion that training VSR models with mixed inputs of compressed and uncompressed frames fosters improved generalization and robustness. This insight should prompt further exploration into adaptive training regimes for VSR models.
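
One simple way to realize such mixed training is to choose, per training sample, between the compressed and uncompressed version of the same low-resolution clip. The sketch below is an assumption about how this could be done, not the authors' training code; the function name and the mixing probability p_compressed are hypothetical, and the ratio used in the paper is not stated in this summary.

```python
import random
from typing import Tuple, TypeVar

T = TypeVar("T")


def sample_training_input(
    uncompressed: T,
    compressed: T,
    p_compressed: float,
    rng: random.Random,
) -> Tuple[T, bool]:
    """Pick the compressed clip with probability p_compressed, else the uncompressed one.

    Returns the chosen clip and a flag telling which version was selected.
    """
    if not 0.0 <= p_compressed <= 1.0:
        raise ValueError("p_compressed must lie in [0, 1]")
    use_compressed = rng.random() < p_compressed
    chosen = compressed if use_compressed else uncompressed
    return chosen, use_compressed


# Usage with placeholder strings standing in for actual frame tensors.
rng = random.Random(0)
picks = [sample_training_input("raw", "crf25", 0.5, rng)[0] for _ in range(8)]
```

Passing a seeded random.Random instance keeps the sampling reproducible across runs.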

Implications and Future Directions

In practical terms, COMISR is applicable to real-world scenarios such as video streaming platforms (e.g., YouTube), where compressed videos are prevalent. Because the model handles a range of compression levels, it could improve viewing quality across different network conditions and device capabilities.

On the research side, this work opens pathways for further exploration of compression-aware neural network architectures. Future work could examine adaptive models that adjust dynamically to the compression techniques and degrees encountered in practice. Further research could also investigate what such models imply for related areas of image and video processing, including improved encoding techniques and compression-artifact detection.

In summary, COMISR represents a significant advancement in the field of video super-resolution, tailored to address the practical challenges posed by real-world video compression. Its pioneering approach serves as both a foundation and an inspiration for subsequent research aimed at bridging the gap between theoretical SR developments and industrial applications.
