
BVI-RLV: A Fully Registered Dataset and Benchmarks for Low-Light Video Enhancement

Published 3 Jul 2024 in cs.CV (arXiv:2407.03535v2)

Abstract: Low-light videos often exhibit spatiotemporal incoherent noise, compromising visibility and performance in computer vision applications. One significant challenge in enhancing such content using deep learning is the scarcity of training data. This paper introduces a novel low-light video dataset, consisting of 40 scenes with various motion scenarios under two distinct low-lighting conditions, incorporating genuine noise and temporal artifacts. We provide fully registered ground truth data captured in normal light using a programmable motorized dolly and refine it via an image-based approach for pixel-wise frame alignment across different light levels. We provide benchmarks based on four different technologies: convolutional neural networks, transformers, diffusion models, and state space models (Mamba). Our experimental results demonstrate the significance of fully registered video pairs for low-light video enhancement (LLVE), and the comprehensive evaluation shows that the models trained with our dataset outperform those trained with the existing datasets. Our dataset and links to benchmarks are publicly available at https://doi.org/10.21227/mzny-8c77.


Summary

  • The paper introduces a fully registered low-light video dataset with 31,800 paired frames captured under varied conditions to improve enhancement models.
  • It employs a programmable motorized dolly and sophisticated post-processing to achieve precise pixel alignment for temporal consistency.
  • Benchmark evaluations spanning CNNs, transformers, diffusion models, and state space models show notable PSNR and SSIM gains for models trained on BVI-RLV.

Overview of BVI-RLV: A Fully Registered Dataset and Benchmarks for Low-Light Video Enhancement

Low-light video enhancement (LLVE) remains a challenging domain within computer vision due to photon noise, color shifts, white balance inconsistencies, and temporal artifacts inherent to such footage. The paper "BVI-RLV: A Fully Registered Dataset and Benchmarks for Low-Light Video Enhancement" addresses these challenges by introducing a meticulously curated dataset, BVI-RLV, that includes fully registered low-light video sequences and their corresponding normal-light ground truths.

Dataset Introduction

The BVI-RLV dataset offers 40 dynamic scenes, each captured under two distinct low-light conditions alongside a normal-light reference, covering a broad array of motion types and content variations relevant to real-world applications. The scenes are diversified by capturing both moving objects against static backgrounds and fully dynamic scenes with multiple types of camera and object motion.

To ensure pixel-wise alignment between low-light and normal-light video frames, the authors employ a programmable motorized dolly system coupled with sophisticated post-processing alignment techniques. This alignment is crucial for the temporal consistency required in machine learning models designed for LLVE. The dataset includes over 31,800 ground-truth paired frames, making it one of the most comprehensive datasets available for this task.
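
The paper's exact refinement procedure is not reproduced here, but the idea of image-based pixel-wise registration can be illustrated with a standard tool. The snippet below is a minimal sketch assuming OpenCV's ECC algorithm as a stand-in for the authors' method; ECC is a reasonable choice for this illustration because its correlation criterion tolerates the large brightness gap between light levels.

```python
# Minimal sketch of image-based, pixel-wise registration between a
# normal-light reference frame and its low-light counterpart. ECC is an
# assumption standing in for the paper's refinement step.
import cv2
import numpy as np

def align_to_reference(reference: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Estimate an affine warp with ECC and apply it to `target` (BGR frames)."""
    ref_gray = cv2.cvtColor(reference, cv2.COLOR_BGR2GRAY)
    tgt_gray = cv2.cvtColor(target, cv2.COLOR_BGR2GRAY)
    warp = np.eye(2, 3, dtype=np.float32)  # identity affine initialization
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 200, 1e-6)
    _, warp = cv2.findTransformECC(ref_gray, tgt_gray, warp,
                                   cv2.MOTION_AFFINE, criteria)
    h, w = reference.shape[:2]
    # WARP_INVERSE_MAP maps `target` onto the reference pixel grid.
    return cv2.warpAffine(target, warp, (w, h),
                          flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP)
```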

Benchmarks and Models

The paper evaluates the performance of multiple deep learning architectures using the BVI-RLV dataset. These models span a range of contemporary technologies:

  1. Convolutional Neural Networks (CNNs): PCDUNet
  2. Transformers: STA-SUNet
  3. Diffusion Models: BVI-CDM
  4. State Space Models: BVI-Mamba

These models were selected to require manageable computational resources, fostering broader accessibility for the research community. As the comprehensive comparative analysis in the paper shows, models trained on BVI-RLV consistently outperform those trained on existing datasets (DRV, SDSD, DID).
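
To make the role of fully registered pairs in supervised training concrete, here is a hypothetical PyTorch-style loader. The flat directory layout and PNG naming are assumptions for illustration, not the dataset's actual on-disk structure.

```python
# Hypothetical loader for fully registered low-light/normal-light frame pairs.
from pathlib import Path

import cv2
import torch
from torch.utils.data import Dataset

class RegisteredPairs(Dataset):
    def __init__(self, low_dir: str, normal_dir: str):
        self.low = sorted(Path(low_dir).glob("*.png"))        # assumed layout
        self.normal = sorted(Path(normal_dir).glob("*.png"))  # assumed layout
        assert len(self.low) == len(self.normal), "pairs must match one-to-one"

    def __len__(self) -> int:
        return len(self.low)

    @staticmethod
    def _load(path: Path) -> torch.Tensor:
        rgb = cv2.cvtColor(cv2.imread(str(path)), cv2.COLOR_BGR2RGB)
        return torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0  # CHW in [0, 1]

    def __getitem__(self, idx: int):
        # Registration at capture time means no spatial alignment is needed here:
        # input and target correspond pixel for pixel.
        return self._load(self.low[idx]), self._load(self.normal[idx])
```

Because the pairs are pre-aligned, a plain pixel-wise loss can supervise training directly, with no warping or alignment module inside the training loop.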

Comparative Results

The experimental results presented in the paper highlight the advantage of training on the BVI-RLV dataset. Notably, models like STA-SUNet and the novel BVI-CDM demonstrate significant improvements in PSNR and SSIM scores, illustrating the efficacy of fully registered pairs for LLVE. The paper also compares benchmark models across datasets, establishing BVI-RLV as the training set that generalizes best to unseen data.
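
For readers who want to reproduce this style of comparison, the metrics involved are standard. The sketch below computes per-frame PSNR and SSIM with scikit-image and averages over a clip; the array shapes and uint8 range are assumptions about how frames are loaded, not specifics from the paper.

```python
# Per-frame PSNR/SSIM averaged over a clip, using scikit-image
# (>= 0.19 for the `channel_axis` argument).
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_clip(enhanced: np.ndarray, reference: np.ndarray):
    """Both inputs: (T, H, W, 3) uint8 arrays of registered frames."""
    psnrs, ssims = [], []
    for out, gt in zip(enhanced, reference):
        psnrs.append(peak_signal_noise_ratio(gt, out, data_range=255))
        ssims.append(structural_similarity(gt, out, channel_axis=-1, data_range=255))
    return float(np.mean(psnrs)), float(np.mean(ssims))
```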

Implications and Future Directions

The introduction of the BVI-RLV dataset has significant practical implications. The availability of a well-curated, comprehensive dataset stands to improve the development of more robust and generalizable LLVE models, benefiting applications in surveillance, robotics, and media production that necessitate high-quality video under low-light conditions.

The theoretical implications are equally profound. The dataset's use of fully registered video pairs across a variety of motions and content types provides a new standard for dataset quality in LLVE research. This methodology could inspire further research into alignment techniques and the integration of motion dynamics in other video enhancement domains.

Speculation on Future Developments

Looking forward, the creation of lighter, more efficient models that can handle large-scale video data without the need for extensive computational resources will be crucial. Additionally, expanding the dataset to include more diverse lighting conditions and environmental settings could further enhance the robustness of LLVE models. There is also potential for applying self-supervised or unpaired learning strategies that can leverage the BVI-RLV dataset to develop models capable of learning from less strictly controlled environments.

Finally, addressing the balance between enhancing video quality and the ethical considerations surrounding potential misuse of this technology will be an essential aspect of future research and application.

Conclusion

The BVI-RLV dataset and the comprehensive benchmarks provided by the authors represent a significant step forward in the field of low-light video enhancement. By combining rigorous data collection with state-of-the-art model evaluations, this work lays a robust foundation for future advancements and sets a new standard for quality in LLVE research.
