
MFQE 2.0: A New Approach for Multi-frame Quality Enhancement on Compressed Video (1902.09707v6)

Published 26 Feb 2019 in cs.CV and cs.MM

Abstract: The past few years have witnessed great success in applying deep learning to enhance the quality of compressed image/video. The existing approaches mainly focus on enhancing the quality of a single frame, not considering the similarity between consecutive frames. Since heavy fluctuation exists across compressed video frames as investigated in this paper, frame similarity can be utilized for quality enhancement of low-quality frames given their neighboring high-quality frames. This task is Multi-Frame Quality Enhancement (MFQE). Accordingly, this paper proposes an MFQE approach for compressed video, as the first attempt in this direction. In our approach, we firstly develop a Bidirectional Long Short-Term Memory (BiLSTM) based detector to locate Peak Quality Frames (PQFs) in compressed video. Then, a novel Multi-Frame Convolutional Neural Network (MF-CNN) is designed to enhance the quality of compressed video, in which the non-PQF and its nearest two PQFs are the input. In MF-CNN, motion between the non-PQF and PQFs is compensated by a motion compensation subnet. Subsequently, a quality enhancement subnet fuses the non-PQF and compensated PQFs, and then reduces the compression artifacts of the non-PQF. Also, PQF quality is enhanced in the same way. Finally, experiments validate the effectiveness and generalization ability of our MFQE approach in advancing the state-of-the-art quality enhancement of compressed video. The code is available at https://github.com/RyanXingQL/MFQEv2.0.git.

Citations (181)

Summary

  • The paper introduces MFQE 2.0, which pairs a BiLSTM-based detector of Peak Quality Frames with an MF-CNN for multi-frame quality enhancement.
  • It compensates for motion and fuses spatiotemporal features across frames to reduce compression artifacts, achieving an average ΔPSNR improvement of 0.562 dB at QP = 37.
  • The approach generalizes across codecs such as H.264 and HEVC, reducing quality fluctuation and improving the viewing experience for compressed video.

Multi-Frame Quality Enhancement (MFQE) for Compressed Video: An Analytical Overview

The paper "MFQE 2.0: A New Approach for Multi-frame Quality Enhancement on Compressed Video" introduces a novel methodology aimed at improving the quality of compressed video by leveraging the inter-frame correlations that naturally exist in video streams. This involves enhancing the quality of degraded video frames (non-PQFs) using adjacent higher-quality frames termed as Peak Quality Frames (PQFs).

Methodology

The method rests on two core operations: detecting PQFs and exploiting them to enhance the non-PQFs. A Bidirectional Long Short-Term Memory (BiLSTM) network identifies PQFs in the video sequence, leveraging both forward and backward temporal information to improve detection accuracy, which is important for ensuring that the best available frames are selected for quality enhancement.
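
As a concrete illustration, the following sketch shows how a BiLSTM-based per-frame PQF classifier might be structured in PyTorch. It is a minimal stand-in rather than the released implementation: the feature dimension, hidden size, and the assumption that each frame has already been reduced to a feature vector (e.g., by a small CNN) are all illustrative placeholders.

```python
import torch
import torch.nn as nn

class PQFDetector(nn.Module):
    """Illustrative BiLSTM that labels each frame as PQF (1) or non-PQF (0).

    Assumes every frame is pre-encoded as a feature vector; the
    dimensions below are placeholders, not the paper's settings.
    """

    def __init__(self, feat_dim=128, hidden_dim=64):
        super().__init__()
        self.bilstm = nn.LSTM(feat_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, 1)

    def forward(self, frame_feats):
        # frame_feats: (batch, num_frames, feat_dim)
        seq, _ = self.bilstm(frame_feats)          # (batch, T, 2 * hidden_dim)
        logits = self.classifier(seq).squeeze(-1)  # (batch, T)
        return torch.sigmoid(logits)               # per-frame PQF probability

# Frames whose probability exceeds 0.5 are treated as candidate PQFs.
probs = PQFDetector()(torch.randn(1, 30, 128))  # a 30-frame clip
pqf_mask = probs > 0.5
```

The bidirectional pass is what lets the detector use both past and future frames when deciding whether a given frame is a local quality peak.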

Upon detection of PQFs, enhancement proceeds through a Multi-Frame CNN (MF-CNN) comprising two subnets. An MC-subnet first compensates for motion across frames, correcting temporal shifts between the PQFs and the non-PQF. A QE-subnet then extracts and fuses multi-frame spatiotemporal information to reduce the compression artifacts in the non-PQF. The architecture combines motion estimation, feature extraction, and mapping to exploit information from neighboring PQFs for quality restoration.
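
The sketch below illustrates this two-stage data flow, assuming single-channel (luma) frames and stand-in convolutional stacks; the paper's actual MC-subnet uses a more elaborate pyramidal motion estimator, and its QE-subnet is considerably deeper. Only the overall flow, warping the two PQFs toward the non-PQF and then fusing them to predict an artifact-reduction residual, mirrors the described architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def warp(frame, flow):
    """Warp a frame toward the non-PQF using a dense flow field.

    frame: (B, 1, H, W) luma; flow: (B, 2, H, W) in pixel offsets.
    """
    B, _, H, W = frame.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(frame.device)  # (2, H, W)
    coords = base.unsqueeze(0) + flow
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    gx = 2.0 * coords[:, 0] / (W - 1) - 1.0
    gy = 2.0 * coords[:, 1] / (H - 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)  # (B, H, W, 2)
    return F.grid_sample(frame, grid, align_corners=True)

class MFCNNSketch(nn.Module):
    """Toy stand-in for MF-CNN: motion compensation, then QE fusion."""

    def __init__(self):
        super().__init__()
        # Stand-in flow estimator: frame pair in, 2-channel flow out.
        self.flow_net = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1))
        # Stand-in QE subnet: fuses the non-PQF with two compensated
        # PQFs and predicts an artifact-reduction residual.
        self.qe_net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))

    def forward(self, non_pqf, prev_pqf, next_pqf):
        # (1) MC-subnet: estimate flow toward the non-PQF, then warp.
        comp_prev = warp(prev_pqf,
                         self.flow_net(torch.cat([prev_pqf, non_pqf], 1)))
        comp_next = warp(next_pqf,
                         self.flow_net(torch.cat([next_pqf, non_pqf], 1)))
        # (2) QE-subnet: fuse and enhance via residual learning.
        fused = torch.cat([comp_prev, non_pqf, comp_next], dim=1)
        return non_pqf + self.qe_net(fused)
```

Residual learning (adding a predicted correction to the input frame) is a common choice in artifact-reduction networks, since the enhanced frame should differ from its input only by the removed artifacts.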

Results and Comparisons

Empirical results demonstrate the superiority of MFQE 2.0 over existing methods such as AR-CNN, DnCNN, Li et al., DCAD, and DS-CNN in both objective and subjective quality metrics. The paper quantifies its results using ΔPSNR, where the proposed method achieves an average improvement of 0.562 dB at QP = 37, significantly outperforming the other tested solutions. The gains are largest for non-PQFs, indicating successful exploitation of content similarity across frames. The approach also reduces overall quality fluctuation across the video, which is critical for a stable viewing experience.
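
Here ΔPSNR denotes the PSNR gain of the enhanced frame over its compressed input, both measured against the uncompressed original. A minimal reference computation (NumPy, 8-bit frames assumed):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two 8-bit frames."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def delta_psnr(raw, compressed, enhanced):
    """PSNR gain of the enhanced frame over its compressed version."""
    return psnr(raw, enhanced) - psnr(raw, compressed)
```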

Moreover, the method generalizes well across codecs (H.264 and HEVC) and transfers effectively across datasets, as demonstrated by further testing on previously unseen sequences.

Implications and Future Directions

The implications of this research are significant for both practical and theoretical domains. Practically, enhancing video quality on the decoder side, without altering existing compression standards, can substantially improve the user experience in bandwidth-constrained environments. Theoretically, the work provides a foundation for extending cross-frame learning to broader video processing tasks, such as super-resolution and denoising.

Looking ahead, embedding perceptual metrics in the enhancement process could bridge the gap between the achieved PSNR improvements and real-world perceptual quality assessments. Additionally, incorporating encoder-side information could yield further gains in compression artifact reduction and overall visual quality.

In conclusion, this work presents a methodologically sound and empirically validated approach for video quality enhancement, laying the groundwork for future advances in both compression artifact reduction and video processing. MFQE 2.0 is not only efficient but also adeptly exploits the redundant information inherent in video sequences to drive quality improvements.