- The paper introduces a novel unsupervised hierarchical spiking neural network that employs an adaptive neuron model and a stable STDP rule to learn motion selectivity from event-based sensor data.
- The architecture integrates SS-Conv, merge, MS-Conv, pooling, and dense layers to efficiently extract spatial and temporal features for local and global motion perception.
- Experimental results on synthetic and real event sequences demonstrate natural emergence of motion direction and speed selectivity, paving the way for low-power real-time visual processing.
Unsupervised Learning of a Hierarchical Spiking Neural Network for Optical Flow Estimation: From Events to Global Motion Perception
The integration of spiking neural networks (SNNs) with event-based vision sensors offers a promising route to efficient, high-bandwidth, low-latency optical flow estimation. The paper presents a novel hierarchical spiking architecture that learns motion selectivity in an unsupervised manner directly from the raw output of an event-based camera. The framework combines a new adaptive neuron model with a stable formulation of spike-timing-dependent plasticity (STDP) to build an SNN whose motion processing closely resembles that of biological visual systems.
Contributions and Methodology
- Adaptive Neuron Model: The paper introduces an adaptation of the leaky integrate-and-fire (LIF) neuron model that lets the neuronal response track the fluctuating input statistics inherent to event-based sensors. This is pivotal for keeping neuron excitability aligned with the highly variable firing rates produced by the moving scenes these sensors capture (a minimal sketch follows after this list).
- Stable STDP Rule: A novel STDP implementation is proposed that ensures stability by inherently balancing long-term potentiation (LTP) and long-term depression (LTD), without requiring additional stabilizing mechanisms. Weight updates depend on both the current synaptic weight and normalized presynaptic traces, so weights converge naturally to equilibrium values that reflect the relevance of each synapse (an illustrative update is sketched after this list).
- Hierarchical SNN Architecture:
The network stacks the following layers in order (a schematic pipeline is sketched after this list):
- SS-Conv Layer: Extracts spatial features from the input.
- Merge Layer: Aggregates features to form a unified representation.
- MS-Conv Layer: Identifies local motion using spatiotemporal convolutional kernels.
- Pooling Layer: Reduces spatial dimensionality for efficient global motion perception.
- Dense Layer: Develops global motion selectivity through full connectivity.
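To make the adaptive neuron model concrete, the sketch below shows a minimal leaky integrate-and-fire neuron whose input gain is modulated by recent presynaptic activity. The class name, parameters, and the specific adaptation law (dividing the input gain by the summed presynaptic traces) are illustrative assumptions for this summary, not the paper's exact formulation.

```python
# Minimal sketch of an adaptive leaky integrate-and-fire (LIF) neuron.
# The effective input gain shrinks as recent presynaptic activity grows,
# so excitability tracks the fluctuating event rate of the sensor.
# Class name, parameters, and the adaptation law are assumptions made
# for illustration, not the paper's exact formulation.
import numpy as np

class AdaptiveLIF:
    def __init__(self, n_inputs, tau_v=5e-3, tau_x=7e-3, v_th=1.0, alpha=0.25):
        self.w = np.random.uniform(0.0, 1.0, n_inputs)  # synaptic weights
        self.v = 0.0                                     # membrane potential
        self.x = np.zeros(n_inputs)                      # presynaptic traces
        self.tau_v, self.tau_x = tau_v, tau_x
        self.v_th, self.alpha = v_th, alpha

    def step(self, spikes_in, dt=1e-3):
        """Advance one time step; spikes_in is a 0/1 array of input spikes."""
        # Presynaptic traces: low-pass filtered input spike trains.
        self.x += dt * (-self.x / self.tau_x) + spikes_in
        # Adaptation: more total presynaptic activity -> smaller gain per
        # incoming spike, keeping firing rates bounded as input statistics vary.
        gain = 1.0 / (1.0 + self.alpha * self.x.sum())
        # Leaky integration of the weighted, gain-modulated input.
        self.v += dt * (-self.v / self.tau_v) + gain * float(np.dot(self.w, spikes_in))
        if self.v >= self.v_th:  # threshold crossing -> output spike and reset
            self.v = 0.0
            return 1
        return 0
```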
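The stable STDP rule can likewise be illustrated with a weight update that depends on the current weight and on normalized presynaptic traces. The functional form below (soft-bounded exponential LTP and LTD terms) is an assumption chosen to reproduce the described equilibrium behavior; the paper's exact expression may differ.

```python
# Illustrative update for a self-stabilizing STDP rule in which LTP and LTD
# depend on the current weight and on normalized presynaptic traces. The
# exact exponential form below is an assumption chosen to reproduce the
# described equilibrium behavior, not the paper's expression.
import numpy as np

def stdp_update(w, x, w_init=0.5, eta=1e-3, eps=1e-9):
    """One update, applied when the postsynaptic neuron fires.

    w : current synaptic weights
    x : presynaptic traces at the time of the postsynaptic spike
    """
    x_hat = np.asarray(x, dtype=float) / (np.max(x) + eps)  # normalized traces in [0, 1]
    ltp = np.exp(-(w - w_init)) * x_hat        # potentiation, damped as w grows
    ltd = np.exp(w - w_init) * (1.0 - x_hat)   # depression, damped as w shrinks
    return w + eta * (ltp - ltd)

# With this form the fixed point of each weight is
#   w* = w_init + 0.5 * ln(x_hat / (1 - x_hat)),
# so synapses with consistently high traces settle at large weights and
# weakly driven synapses settle at small ones, with no clipping needed.
```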
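Finally, the layer ordering can be summarized as a plain pipeline specification. All kernel sizes, map counts, and the temporal depth of the MS-Conv kernels below are placeholders rather than the values reported in the paper.

```python
# Plain specification of the layer ordering described above. Kernel sizes,
# map counts, and the temporal depth of the MS-Conv kernels are placeholders,
# not the values reported in the paper.
SNN_PIPELINE = [
    # (layer, role, illustrative parameters)
    ("SS-Conv", "spatial feature extraction from input events",
     {"kernel": (7, 7), "maps": 4}),
    ("Merge",   "aggregates SS-Conv maps into a single representation",
     {"maps": 1}),
    ("MS-Conv", "local motion via spatiotemporal kernels",
     {"kernel": (7, 7), "temporal_taps": 10, "maps": 16}),
    ("Pooling", "spatial downsampling for global motion perception",
     {"window": (8, 8)}),
    ("Dense",   "global motion selectivity via full connectivity",
     {"neurons": 16}),
]

for name, role, params in SNN_PIPELINE:
    print(f"{name:8s} -> {role} {params}")
```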
Numerical Results
The network is validated on both synthetic and real event sequences, where selectivity to motion direction and speed emerges naturally through the unsupervised learning process. Individual MS-Conv kernels specialize in distinct motion directions and speeds, and the Dense layer effectively perceives global motion by hierarchically integrating these local motion responses.
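One generic way to quantify the reported direction selectivity of a learned kernel is a circular selectivity index over its mean responses to motion in different directions. The metric below is a standard circular-statistics measure used here purely for illustration; it is not necessarily the evaluation protocol of the paper.

```python
# A generic circular-statistics measure of how direction-selective a learned
# kernel is, computed from its mean responses to motion in several directions
# (1 = tuned to a single direction, 0 = equal response to all directions).
# Used here purely for illustration, not necessarily the paper's evaluation.
import numpy as np

def direction_selectivity_index(directions_rad, responses):
    """directions_rad: stimulus motion directions (radians);
    responses: mean spike counts of one kernel per direction."""
    responses = np.asarray(responses, dtype=float)
    vec = np.sum(responses * np.exp(1j * np.asarray(directions_rad)))
    return np.abs(vec) / (responses.sum() + 1e-9)

# Example: a kernel that responds mostly to rightward (0 deg) motion.
dirs = np.deg2rad([0, 90, 180, 270])
print(direction_selectivity_index(dirs, [30, 5, 2, 4]))  # ~0.68, strong 0-deg bias
```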
Implications and Future Directions
The implications of this work are multifaceted, particularly in advancing efficient neuromorphic computing methods tailored for real-time processing in domains such as micro air vehicles (MAVs) and autonomous driving. This bio-inspired approach to motion perception points toward a shift in how visual information can be processed with low power consumption while maintaining high temporal resolution.
Notably, a stable learning rule such as the proposed STDP variant could lay the foundation for further research on neural plasticity mechanisms applicable across broader AI applications. Future work could extend the model to more complex dynamic environments, enrich its adaptive capacity through reinforcement learning paradigms, or integrate multi-modal sensor fusion for more comprehensive scene understanding. Moreover, deploying the architecture on neuromorphic hardware could catalyze the development of fast, low-power devices capable of sophisticated real-time visual processing.