NTIRE 2021 Challenge on Quality Enhancement of Compressed Video: Methods and Results
The paper reviews the NTIRE 2021 Challenge on quality enhancement of compressed video, which was built on a newly introduced dataset, the Large-scale Diverse Video (LDV) dataset. This essay outlines the key methods and results from the challenge, emphasizing the technical approaches taken by the participating teams across three tracks (Tracks 1, 2, and 3) that differ in compression setting and evaluation metric.
Challenge Overview
The NTIRE 2021 Challenge consisted of three tracks. Tracks 1 and 3 targeted fidelity: they aimed at enhancing videos compressed with HEVC at a fixed QP and with x265 at a fixed bit-rate, respectively, and were evaluated with fidelity metrics such as PSNR and MS-SSIM. Track 2 focused instead on perceptual quality, judged by MOS (mean opinion score) and other perceptual metrics such as LPIPS and FID.
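To make the fidelity metric concrete, the following is a minimal sketch of PSNR for 8-bit content, using only the Python standard library. The function names `psnr` and `mean_psnr`, the flat-list frame representation, and the choice to average per-frame scores over a video are assumptions made for illustration; they are not taken from the challenge's evaluation code.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], enhanced: Sequence[float],
         peak: float = 255.0) -> float:
    """Return PSNR in dB between two equally sized sequences of pixel values."""
    if len(reference) != len(enhanced) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames have unbounded PSNR
    return 10.0 * math.log10(peak ** 2 / mse)


def mean_psnr(ref_frames: Sequence[Sequence[float]],
              enh_frames: Sequence[Sequence[float]],
              peak: float = 255.0) -> float:
    """Average per-frame PSNR over a video (assumed aggregation rule)."""
    scores = [psnr(r, e, peak) for r, e in zip(ref_frames, enh_frames)]
    return sum(scores) / len(scores)
```

For example, `psnr([10, 20], [11, 21])` has a mean squared error of 1 and therefore evaluates to 20·log10(255), about 48.13 dB. In practice frames would be image arrays rather than flat lists, but the arithmetic is the same.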
Methods and Techniques
A variety of approaches were presented by the participating teams to tackle the challenges set forth in each track. Some key highlights include:
- BILIBILI AI & FDU Team: They proposed a Spatiotemporal Model with Gated Fusion for fidelity tracks and a perceptual extension for the perceptual track. Their architecture employed deformable convolutions and channel attention mechanisms enhanced by gated fusion.
- NTU-SLab Team: They introduced the BasicVSR++ method, which improved upon BasicVSR through grid propagation and flow-guided deformable alignment to efficiently capture and align spatiotemporal features.
- VUE Team: They leveraged BasicVSR with multi-stage training and ensemble techniques for the fidelity tracks, and proposed an adaptive spatial-temporal fusion for perceptual quality enhancement.
- NOAHTCV Team: Implemented a multi-scale network with a deformable temporal fusion mechanism, using a shared U-Net model for feature extraction and alignment, optimized through tailored loss functions for each track.
- MT.MaxClear Team: Their work focused on stability. They added regularization to the deformable convolutions of the existing EDVR framework, stabilizing the learned offsets to preserve continuity across frames.
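The gated fusion mentioned in the list above can be illustrated with a generic element-wise gate. This is a minimal sketch of the general idea only: the summary does not specify how any team computed its gate, so the scalar parameters `w_a`, `w_b`, and `bias` below are hypothetical stand-ins for what would normally be learned convolution weights, and the function is not the BILIBILI AI & FDU architecture.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(feat_a: Sequence[float], feat_b: Sequence[float],
                 w_a: float, w_b: float, bias: float) -> List[float]:
    """Blend two feature vectors element-wise with a sigmoid gate.

    gate_i = sigmoid(w_a * a_i + w_b * b_i + bias)
    out_i  = gate_i * a_i + (1 - gate_i) * b_i
    The scalar weights are hypothetical; real networks predict the gate
    with learned layers over both feature maps.
    """
    if len(feat_a) != len(feat_b):
        raise ValueError("feature vectors must have the same length")
    fused = []
    for a, b in zip(feat_a, feat_b):
        gate = sigmoid(w_a * a + w_b * b + bias)
        fused.append(gate * a + (1.0 - gate) * b)
    return fused
```

With all parameters set to zero the gate is 0.5 everywhere and the output is the plain average of the two inputs; a large positive `bias` pushes the output toward `feat_a`, a large negative one toward `feat_b`. The point of the gate is that this mixing ratio can vary per element instead of being fixed.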
Results and Implications
The results showed varied success across methodologies, with the BILIBILI AI & FDU and NTU-SLab teams consistently achieving top results across multiple tracks. The challenge facilitated a deeper understanding of effective methodologies for video compression artifact reduction, with emphasis on the trade-offs between computational efficiency and image fidelity. The novel contributions of grid propagation and constrained deformable alignment have shown promise in advancing state-of-the-art techniques in video quality enhancement.
Future Directions
The NTIRE 2021 Challenge highlights the growing difficulty of delivering high-quality video content under resource constraints. Future research directions may include:
- Developing models that balance speed and quality for real-time applications.
- Applying novel architectures like transformers for video enhancement tasks.
- Improving robustness to diverse video content, ensuring reliable enhancement regardless of scene complexity or motion characteristics.
Conclusion
The NTIRE 2021 Challenge showcased the breadth of methods applicable to video quality enhancement, from traditional convolutional networks to advanced methods employing optical flow and deformable convolutions. Continued advancements in this domain will likely build on the collective insights gained through challenges like NTIRE, driving innovations in efficient video streaming and consumption.