DiffVSR: Enhancing Real-World Video Super-Resolution with Diffusion Models for Advanced Visual Quality and Temporal Consistency (2501.10110v2)

Published 17 Jan 2025 in cs.CV

Abstract: Diffusion models have demonstrated exceptional capabilities in image generation and restoration, yet their application to video super-resolution faces significant challenges in maintaining both high fidelity and temporal consistency. We present DiffVSR, a diffusion-based framework for real-world video super-resolution that effectively addresses these challenges through key innovations. For intra-sequence coherence, we develop a multi-scale temporal attention module and temporal-enhanced VAE decoder that capture fine-grained motion details. To ensure inter-sequence stability, we introduce a noise rescheduling mechanism with an interweaved latent transition approach, which enhances temporal consistency without additional training overhead. We propose a progressive learning strategy that transitions from simple to complex degradations, enabling robust optimization despite limited high-quality video data. Extensive experiments demonstrate that DiffVSR delivers superior results in both visual quality and temporal consistency, setting a new performance standard in real-world video super-resolution.


Authors (9)

Xiaohui Li
Yihao Liu
Shuo Cao
Ziyan Chen
Shaobin Zhuang
Xiangyu Chen
Yinan He
Yi Wang
Yu Qiao
