- The paper introduces EV-FlowNet, a self-supervised deep learning framework that estimates optical flow from event-based camera data.
- It employs a four-channel, image-based event representation that encodes event counts and recent timestamps, allowing event streams to be processed by standard CNN architectures.
- Empirical evaluations on the MVSEC dataset demonstrate accuracy competitive with frame-based methods, with reduced noise in low-texture regions.
Insights on "EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras"
The paper "EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras" introduces a novel approach to optical flow estimation leveraging the unique advantages of event-based cameras. Event-based cameras offer substantial benefits over traditional frame-based cameras in scenarios with high-speed motion and challenging lighting conditions by capturing changes in the scene asynchronously. This research addresses the challenge of developing effective algorithms for such dynamic data, focusing specifically on optical flow estimation.
Overview of EV-FlowNet Approach
EV-FlowNet is a self-supervised deep learning framework designed for event-based cameras, marking an advance in leveraging asynchronous event data for optical flow estimation. Traditional frame-based deep learning models are ill-suited to the asynchronous nature of event cameras, which calls for an approach that neither relies on hand-crafted feature pipelines nor requires extensive labeled datasets.
The key contributions of this work are twofold:
- Event Representation: The authors propose an image-based representation of events with four channels: two channels counting positive and negative events at each pixel, and two channels recording the timestamps of the most recent positive and negative events. This lets standard convolutional neural network (CNN) architectures consume event data while preserving the spatio-temporal structure of the event stream (a code sketch of this representation follows this list).
- Self-Supervised Learning: The approach relies on self-supervised learning using grayscale images captured alongside the event data. A photometric loss warps one grayscale image toward the other with the predicted flow and penalizes the remaining difference, which obviates the need for labeled optical flow data, a resource that is scarce for event-based cameras (a sketch of this loss also appears below).
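To make the event representation concrete, here is a minimal sketch of how such a four-channel event image could be assembled, assuming events arrive as (x, y, timestamp, polarity) tuples. The function name and the timestamp normalization are illustrative choices, not taken from the authors' code.

```python
import numpy as np

def build_event_image(events, height, width):
    """Accumulate (x, y, t, polarity) events into a 4-channel image.

    Channels: [0] positive-event counts, [1] negative-event counts,
              [2] timestamp of the most recent positive event per pixel,
              [3] timestamp of the most recent negative event per pixel.
    """
    image = np.zeros((4, height, width), dtype=np.float32)
    for x, y, t, polarity in events:
        if polarity > 0:
            image[0, y, x] += 1.0                        # count positive events
            image[2, y, x] = max(image[2, y, x], t)      # latest positive timestamp
        else:
            image[1, y, x] += 1.0                        # count negative events
            image[3, y, x] = max(image[3, y, x], t)      # latest negative timestamp
    # Scale timestamps into [0, 1] within the window so the network sees
    # relative recency rather than absolute time (a common convention).
    t_max = image[2:].max()
    if t_max > 0:
        image[2:] /= t_max
    return image
```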
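The photometric supervision can likewise be sketched as follows, assuming PyTorch tensors: the later grayscale image is warped back toward the earlier one with the predicted flow, and the per-pixel difference is penalized with a robust Charbonnier-style term. This follows the general formulation rather than the authors' exact implementation; all names and default values are illustrative.

```python
import torch
import torch.nn.functional as F

def photometric_loss(img_prev, img_next, flow, alpha=0.45, eps=1e-3):
    """Warp img_next back toward img_prev using the predicted flow and apply
    a Charbonnier-style robust penalty to the per-pixel difference.

    img_prev, img_next: (B, 1, H, W) grayscale images
    flow:               (B, 2, H, W) predicted flow in pixels (x, y)
    """
    b, _, h, w = flow.shape
    # Build a normalized sampling grid in [-1, 1], shifted by the flow.
    ys, xs = torch.meshgrid(
        torch.arange(h, device=flow.device, dtype=flow.dtype),
        torch.arange(w, device=flow.device, dtype=flow.dtype),
        indexing="ij",
    )
    grid_x = (xs + flow[:, 0]) / (w - 1) * 2 - 1
    grid_y = (ys + flow[:, 1]) / (h - 1) * 2 - 1
    grid = torch.stack((grid_x, grid_y), dim=-1)           # (B, H, W, 2)
    warped = F.grid_sample(img_next, grid, align_corners=True)
    diff = warped - img_prev
    return ((diff ** 2 + eps ** 2) ** alpha).mean()
```

In practice such a photometric term is usually paired with a smoothness regularizer on the flow field so that textureless regions still receive a sensible gradient signal.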
Empirical Evaluation and Results
The authors have demonstrated the efficacy of EV-FlowNet through comprehensive empirical evaluations using the Multi Vehicle Stereo Event Camera (MVSEC) dataset. The dataset includes sequences captured in diverse conditions (indoor flying, outdoor driving) with corresponding ground truth optical flow generated from depth data and vehicle poses.
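As a rough illustration of how such ground truth can be derived, the sketch below back-projects a depth map, moves the points through the relative camera motion, reprojects them, and takes the pixel displacement as flow. It assumes a static scene and known intrinsics, and omits the interpolation and timing details of the actual dataset pipeline; all names are illustrative.

```python
import numpy as np

def flow_from_depth_and_pose(depth, K, T_rel):
    """Approximate the optical flow induced by camera motion over a static scene.

    depth: (H, W) depth map for the first camera pose
    K:     (3, 3) camera intrinsics
    T_rel: (4, 4) transform mapping points from the first camera frame
           into the second camera frame
    """
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1)
    # Back-project pixels to 3D points in the first camera frame.
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)
    pts_h = np.vstack([pts, np.ones((1, pts.shape[1]))])
    # Move points into the second camera frame and reproject.
    pts2 = (T_rel @ pts_h)[:3]
    pix2 = K @ pts2
    pix2 = pix2[:2] / pix2[2:]
    flow = (pix2 - pix[:2]).reshape(2, h, w)
    return flow
```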
- Performance Metrics: The evaluations report Average Endpoint Error (AEE) and the percentage of outlier pixels. EV-FlowNet exhibits competitive performance, particularly for the larger time-window evaluation (dt=4 grayscale frames), which corresponds to larger flow displacements where traditional frame-based methods struggle (a sketch of these metrics follows this list).
- Comparison to Prior Work: When compared against UnFlow, a frame-based self-supervised method evaluated on the accompanying grayscale frames, EV-FlowNet achieves comparable accuracy with reduced noise in the low-texture regions that often challenge traditional approaches. The combination of temporal and spatial information from the event stream provides a robust input for optical flow estimation, mitigating common pitfalls encountered by frame-based techniques in sparse or rapidly changing scenes.
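For reference, a minimal sketch of how these two metrics are typically computed is given below. The 3-pixel outlier threshold and the optional validity mask are common conventions rather than the paper's exact evaluation protocol.

```python
import numpy as np

def average_endpoint_error(flow_pred, flow_gt, valid_mask=None, outlier_thresh=3.0):
    """Compute Average Endpoint Error (AEE) and outlier percentage.

    flow_pred, flow_gt: (2, H, W) flow fields in pixels
    valid_mask:         optional (H, W) boolean mask of pixels with ground truth
    """
    error = np.linalg.norm(flow_pred - flow_gt, axis=0)   # per-pixel endpoint error
    if valid_mask is not None:
        error = error[valid_mask]
    aee = error.mean()
    pct_outliers = 100.0 * (error > outlier_thresh).mean()
    return aee, pct_outliers
```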
Implications and Future Directions
The implications of this research are both theoretical and practical. Theoretically, the work shows that methodologies developed for frame-based systems can be adapted to the distinct characteristics of event-based sensors. Practically, the self-supervised training scheme removes the dependence on labeled data and could accelerate the adoption of event-based cameras in autonomous systems, surveillance, and other domains requiring robust motion estimation under challenging conditions.
Looking forward, this research sets a precedent for applying neural networks to asynchronous event data, which could lead to further advances in event-based processing. Future work may explore additional loss functions that provide supervision from the event stream alone, enabling training in environments where traditional cameras fail. Such continued evolution could bring event-based cameras into broader mainstream use in real-time, high-velocity settings.