
High Speed and High Dynamic Range Video with an Event Camera (1906.07165v1)

Published 15 Jun 2019 in cs.CV

Abstract: Event cameras are novel sensors that report brightness changes in the form of a stream of asynchronous "events" instead of intensity frames. They offer significant advantages with respect to conventional cameras: high temporal resolution, high dynamic range, and no motion blur. While the stream of events encodes in principle the complete visual signal, the reconstruction of an intensity image from a stream of events is an ill-posed problem in practice. Existing reconstruction approaches are based on hand-crafted priors and strong assumptions about the imaging process as well as the statistics of natural images. In this work we propose to learn to reconstruct intensity images from event streams directly from data instead of relying on any hand-crafted priors. We propose a novel recurrent network to reconstruct videos from a stream of events, and train it on a large amount of simulated event data. During training we propose to use a perceptual loss to encourage reconstructions to follow natural image statistics. We further extend our approach to synthesize color images from color event streams. Our network surpasses state-of-the-art reconstruction methods by a large margin in terms of image quality (> 20%), while comfortably running in real-time. We show that the network is able to synthesize high framerate videos (> 5,000 frames per second) of high-speed phenomena (e.g. a bullet hitting an object) and is able to provide high dynamic range reconstructions in challenging lighting conditions. We also demonstrate the effectiveness of our reconstructions as an intermediate representation for event data. We show that off-the-shelf computer vision algorithms can be applied to our reconstructions for tasks such as object classification and visual-inertial odometry and that this strategy consistently outperforms algorithms that were specifically designed for event data.

Citations (501)

Summary

  • The paper introduces a novel recurrent network that learns video reconstruction from event data without relying on hand-crafted priors.
  • The method improves image quality metrics by more than 20% over prior state-of-the-art reconstructions, runs comfortably in real time, and can synthesize video at more than 5,000 frames per second.
  • The approach extends to color image synthesis from event streams, bridging event-based sensors with conventional computer vision applications.

High Speed and High Dynamic Range Video with an Event Camera

The paper introduces an approach to reconstructing high-speed and high dynamic range (HDR) video with event cameras, addressing a key obstacle to using these sensors in mainstream computer vision. Event cameras differ from traditional cameras by capturing brightness changes as a stream of asynchronous events, a design that inherently provides high temporal resolution, high dynamic range, and freedom from motion blur.
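
Because the sensor output is an asynchronous event stream rather than frames, learning-based pipelines typically begin by binning events into a fixed-size tensor. Below is a minimal NumPy sketch of one common choice, a spatio-temporal voxel grid; the (t, x, y, polarity) event layout and the bilinear temporal binning are illustrative assumptions rather than the paper's exact preprocessing.

```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate one chunk of events into a (num_bins, H, W) voxel grid.

    `events` is an (N, 4) array of rows (t, x, y, polarity) with polarity
    in {-1, +1}; each event's polarity is split bilinearly between its two
    nearest temporal bins.
    """
    voxel = np.zeros((num_bins, height, width), dtype=np.float32)
    if len(events) == 0:
        return voxel

    t = events[:, 0]
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = events[:, 3]

    # Normalize this chunk's timestamps to [0, num_bins - 1].
    t_norm = (num_bins - 1) * (t - t[0]) / max(t[-1] - t[0], 1e-9)

    left = np.floor(t_norm).astype(int)
    right = np.clip(left + 1, 0, num_bins - 1)
    w_right = t_norm - left          # bilinear weight toward the later bin
    np.add.at(voxel, (left, y, x), p * (1.0 - w_right))
    np.add.at(voxel, (right, y, x), p * w_right)
    return voxel
```

Each chunk of events then becomes one network input, and consecutive chunks form the sequence that a recurrent reconstruction network consumes.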

Key Contributions

  1. Reconstruction Approach: The authors propose a recurrent network designed for video reconstruction from event data. Unlike previous methods that rely on hand-crafted priors, this approach learns directly from data, using a large amount of simulated event data for training (a minimal sketch of such a recurrent core follows this list).
  2. Perceptual Loss: The paper utilizes a perceptual loss during training to enforce alignment with natural image statistics. This encourages the generation of images that are not only accurate but also visually appealing.
  3. Color Image Synthesis: Extending the reconstruction from monochrome to color, the research also demonstrates the capability to synthesize color images from event streams with color data, further broadening the applicability.
  4. Performance Metrics: The network outperformed state-of-the-art reconstruction methods by more than 20% on image quality metrics. It also runs comfortably in real time and can synthesize video of high-speed phenomena at more than 5,000 frames per second.
  5. Generalization to Diverse Conditions: Quantitative results show robust performance across varied scenarios (high-speed, HDR, and color settings), demonstrating the versatility of the approach.
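
The paper's architecture is a recurrent, UNet-style network; the PyTorch sketch below shows only a minimal recurrent core, assuming 5-bin voxel grids as input. All names, channel counts, and the single-convolution readout are illustrative, and the loop stands in for processing successive chunks of the event stream.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Convolutional LSTM cell: the recurrent state lets each new
    reconstruction build on everything the network has seen so far."""
    def __init__(self, in_ch, hidden_ch, kernel=3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hidden_ch, 4 * hidden_ch,
                               kernel, padding=kernel // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

# Placeholder event tensors: three chunks of 5-bin voxel grids at 240x180,
# roughly the resolution of common event sensors.
event_voxel_grids = [torch.randn(1, 5, 180, 240) for _ in range(3)]

cell = ConvLSTMCell(in_ch=5, hidden_ch=32)
head = nn.Conv2d(32, 1, 1)             # hidden state -> intensity image
h = c = torch.zeros(1, 32, 180, 240)

with torch.no_grad():
    for voxel in event_voxel_grids:
        out, (h, c) = cell(voxel, (h, c))
        frame = torch.sigmoid(head(out))  # one reconstructed frame per chunk
        # Training would compare `frame` against ground truth using a
        # perceptual loss to match natural image statistics, per the paper.
```

Because the state is carried across chunks rather than reset, the network can integrate information over arbitrarily long event streams, which is what makes very high output frame rates possible.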

Impact and Applications

The method detailed in the paper has significant implications:

  • Practical Applications: The ability to generate high-speed and HDR video opens up new possibilities in fields such as sports analysis, industrial inspection, and surveillance where traditional cameras fall short.
  • Intermediate Representation: By transforming event data into intensity images, the approach serves as a bridge for applying standard computer vision algorithms to event data, shown to be effective in tasks like object classification and visual-inertial odometry (a sketch of this pattern follows this list).
  • Future Research: The release of reconstruction code and pre-trained models is intended to foster additional research, suggesting opportunities for improving performance, exploring alternative architectures, and developing new applications.
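
Since the reconstructions are ordinary intensity images, any stock vision model can consume them unchanged. The sketch below runs a standard torchvision classifier on a placeholder reconstructed frame; the model choice and preprocessing are illustrative assumptions, not the specific evaluation setup used by the authors.

```python
import torch
from torchvision import models, transforms

# A stand-in for one output frame of the event-to-video network above.
frame = torch.rand(1, 1, 180, 240)

classifier = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()
preprocess = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

with torch.no_grad():
    rgb = frame.repeat(1, 3, 1, 1)       # grayscale -> 3-channel input
    logits = classifier(preprocess(rgb))
    print(logits.argmax(dim=1))          # predicted ImageNet class index
```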

Future Directions in AI

Looking forward, the integration of event-based vision and recurrent networks presents promising directions for AI. There is potential for further optimization, such as exploiting the sparsity of event data for computational efficiency and developing hardware accelerators designed specifically for neuro-inspired sensors.

In summary, this work marks a significant step forward in harnessing the unique capabilities of event cameras, aiming to resolve longstanding challenges in high-speed and HDR video reconstruction. By bridging gaps between event-based sensors and conventional computer vision methodologies, the research provides a robust foundation and opens pathways for future advancements in intelligent visual systems.