FastDVDnet: Towards Real-Time Deep Video Denoising Without Flow Estimation (1907.01361v2)

Published 1 Jul 2019 in cs.CV, cs.GR, cs.LG, and eess.IV

Abstract: In this paper, we propose a state-of-the-art video denoising algorithm based on a convolutional neural network architecture. Until recently, video denoising with neural networks had been a largely under explored domain, and existing methods could not compete with the performance of the best patch-based methods. The approach we introduce in this paper, called FastDVDnet, shows similar or better performance than other state-of-the-art competitors with significantly lower computing times. In contrast to other existing neural network denoisers, our algorithm exhibits several desirable properties such as fast runtimes, and the ability to handle a wide range of noise levels with a single network model. The characteristics of its architecture make it possible to avoid using a costly motion compensation stage while achieving excellent performance. The combination between its denoising performance and lower computational load makes this algorithm attractive for practical denoising applications. We compare our method with different state-of-art algorithms, both visually and with respect to objective quality metrics.

Citations (221)

Summary

  • The paper introduces FastDVDnet, a deep CNN that achieves real-time video denoising by bypassing explicit optical flow estimation.
  • It employs a cascaded multi-scale U-Net design processing five consecutive frames with residual learning to efficiently remove noise.
  • Empirical results demonstrate competitive performance with state-of-the-art methods while reducing computational overhead and flickering.

FastDVDnet: Towards Real-Time Deep Video Denoising Without Flow Estimation

The paper introduces FastDVDnet, a deep neural network for video denoising that differs from conventional methods by bypassing explicit motion estimation. Instead of relying on optical flow, FastDVDnet handles motion implicitly within its architecture, which reduces computational complexity and improves runtime.

Algorithmic Framework and Design

FastDVDnet employs a two-step cascaded convolutional neural network (CNN) architecture in which each step is a multi-scale, U-Net-like denoising block, allowing the network to exploit the spatio-temporal information in neighbouring frames. The network takes five consecutive noisy frames and predicts the denoised central frame, using a residual learning strategy in which the noise component is estimated and subtracted. A per-pixel noise map is supplied as an additional input, so a single model handles a wide range of noise levels and no separate models are needed for different noise intensities.
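To make the cascade concrete, the following PyTorch-style sketch shows how five noisy frames and a noise map could flow through the two denoising steps. The triplet grouping, shared first-step weights, noise-map input, and residual output follow the description above; the internal layers of `DenoisingBlock` are simplified placeholders, not the authors' actual multi-scale U-Net implementation.

```python
import torch
import torch.nn as nn


class DenoisingBlock(nn.Module):
    """Stand-in for the multi-scale U-Net-like block described in the paper.

    It receives three consecutive frames plus a per-pixel noise map and
    returns an estimate of the central frame. The plain convolutional stack
    below is only a placeholder for the real encoder/decoder design.
    """

    def __init__(self, channels: int = 3, features: int = 32):
        super().__init__()
        in_ch = 3 * channels + 1  # three frames + one noise-map channel
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, features, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(features, features, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(features, channels, 3, padding=1),
        )

    def forward(self, prev, center, nxt, noise_map):
        x = torch.cat([prev, center, nxt, noise_map], dim=1)
        # Residual learning: the block predicts the noise to subtract from
        # the central frame rather than predicting the clean frame directly.
        return center - self.body(x)


class FastDVDnetSketch(nn.Module):
    """Two-step cascade over five consecutive frames, as described above."""

    def __init__(self):
        super().__init__()
        self.step1 = DenoisingBlock()  # shared weights across the three triplets
        self.step2 = DenoisingBlock()

    def forward(self, frames, noise_map):
        # frames: list of five noisy frames [f0, ..., f4], each of shape (N, C, H, W)
        f0, f1, f2, f3, f4 = frames
        # Step 1: denoise three overlapping triplets with the same block.
        d1 = self.step1(f0, f1, f2, noise_map)
        d2 = self.step1(f1, f2, f3, noise_map)
        d3 = self.step1(f2, f3, f4, noise_map)
        # Step 2: fuse the three intermediate estimates into the final central frame.
        return self.step2(d1, d2, d3, noise_map)


# Usage: denoise the central frame of a 5-frame window at a given noise level.
net = FastDVDnetSketch()
frames = [torch.rand(1, 3, 64, 64) for _ in range(5)]
sigma = 25.0 / 255.0
noise_map = torch.full((1, 1, 64, 64), sigma)  # constant map for Gaussian noise
denoised_center = net(frames, noise_map)
```

Feeding a constant noise map corresponds to additive white Gaussian noise of known standard deviation; a spatially varying map would use the same interface.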

Avoiding explicit motion estimation is the defining design choice. Conventional methods spend considerable computation on accurate optical flow estimation, which is prone to errors, particularly under occlusion or heavy noise. FastDVDnet's cascaded multi-scale architecture instead learns to compensate for frame misalignment implicitly, thereby avoiding the artifacts that inaccurate flow introduces.

Performance and Results

In empirical evaluations, FastDVDnet performs on par with or better than state-of-the-art methods such as VNLB, V-BM4D, VNLnet, and DVDnet. At high noise levels it remains robust and exhibits minimal flickering, a critical property for temporal coherence in video sequences. Quantitative assessment on the DAVIS and Set8 datasets shows that FastDVDnet achieves remarkably fast runtimes, orders of magnitude quicker than patch-based methods such as VNLB and substantially quicker than other network-based denoisers, while delivering high-quality results both visually and in terms of objective metrics (PSNR and ST-RRED).
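For reference, the first of these metrics, peak signal-to-noise ratio, can be computed as below for frames scaled to [0, 1]; this is the standard definition rather than the authors' evaluation code.

```python
import torch


def psnr(denoised: torch.Tensor, clean: torch.Tensor, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio (dB) between a denoised frame and its ground truth."""
    mse = torch.mean((denoised - clean) ** 2)
    return float(10.0 * torch.log10(max_val ** 2 / mse))
```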

Implications and Future Directions

The implications of FastDVDnet are twofold: practical and theoretical. Practically, it introduces a viable path towards real-time video denoising, accommodating the constraints of modern high-definition and ultra high-definition video formats. Theoretically, it challenges and possibly reshapes the paradigm of video denoising by advocating an implicit motion compensation strategy over explicit approaches.

Looking forward, while this paper addresses Gaussian noise removal, the underlying architecture could be adapted to spatially varying and other, more complex noise models. Moreover, the idea of handling frame misalignment implicitly, without dedicated motion modules, may inspire further work as the field moves towards unsupervised methods and more general frameworks for handling diverse video artifacts.

In conclusion, FastDVDnet represents a significant step toward efficient video denoising, balancing computational cost with denoising quality and setting a benchmark for future methods in the domain.
