STDP-based spiking deep convolutional neural networks for object recognition (1611.01421v3)

Published 4 Nov 2016 in cs.CV

Abstract: Previous studies have shown that spike-timing-dependent plasticity (STDP) can be used in spiking neural networks (SNN) to extract visual features of low or intermediate complexity in an unsupervised manner. These studies, however, used relatively shallow architectures, and only one layer was trainable. Another line of research has demonstrated - using rate-based neural networks trained with back-propagation - that having many layers increases the recognition robustness, an approach known as deep learning. We thus designed a deep SNN, comprising several convolutional (trainable with STDP) and pooling layers. We used a temporal coding scheme where the most strongly activated neurons fire first, and less activated neurons fire later or not at all. The network was exposed to natural images. Thanks to STDP, neurons progressively learned features corresponding to prototypical patterns that were both salient and frequent. Only a few tens of examples per category were required and no label was needed. After learning, the complexity of the extracted features increased along the hierarchy, from edge detectors in the first layer to object prototypes in the last layer. Coding was very sparse, with only a few thousands spikes per image, and in some cases the object category could be reasonably well inferred from the activity of a single higher-order neuron. More generally, the activity of a few hundreds of such neurons contained robust category information, as demonstrated using a classifier on Caltech 101, ETH-80, and MNIST databases. We also demonstrate the superiority of STDP over other unsupervised techniques such as random crops (HMAX) or auto-encoders. Taken together, our results suggest that the combination of STDP with latency coding may be a key to understanding the way that the primate visual system learns, its remarkable processing speed and its low energy consumption.

STDP-based Spiking Deep Convolutional Neural Networks for Object Recognition

The paper "STDP-based spiking deep convolutional neural networks for object recognition" presents a novel approach leveraging spiking neural networks (SNN) with spike-timing-dependent plasticity (STDP) as a learning mechanism, aimed at achieving object recognition tasks. Unlike conventional deep learning models that rely on supervised backpropagation and rate-based information encoding, this work employs an unsupervised approach, aligning more closely with biological neural processes, to extract and recognize features from visual inputs.

Key Contributions

  1. Model Architecture: The proposed model is a deep spiking neural network with multiple convolutional and pooling layers for processing visual data. It uses a temporal coding scheme in which neurons fire in order of their activation strength, so that the earliest spikes carry the most informative signal (see the latency-coding sketch after this list).
  2. STDP Learning: Learning is governed by STDP, meaning that synaptic weights are adjusted according to the relative timing of pre- and postsynaptic spikes (see the STDP sketch after this list). This mechanism extracts features of increasing complexity along the hierarchy, from simple edges to whole-object representations.
  3. Efficient Coding and Learning: The network fires sparsely, substantially reducing the number of spikes needed for recognition. It learns from only a few tens of examples per category without any labels, reducing the dependence on the large labeled datasets required by conventional supervised training.
  4. Performance Evaluation: Benchmarked on Caltech 101, ETH-80, and MNIST, the model shows strong categorization performance. It achieves 99.1% accuracy on the Caltech face/motorbike task and remains robust to reductions in training-set size. On ETH-80, which involves substantial intra-category diversity and viewpoint variation, it reports 82.8% accuracy, outperforming several prominent models. On MNIST it reaches 98.4%, competitive performance given its sparse, unsupervised coding.
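
The temporal code can be illustrated with a minimal sketch. The paper orders spikes by activation strength (stronger activation fires earlier); the specific inverse-proportional latency formula, the function name, and the t_max parameter below are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def intensity_to_latency(activations, t_max=1.0):
    """Map activation strengths to spike times: strongly activated units
    fire first, weak units fire later, and units with zero activation
    never fire (time = inf)."""
    activations = np.asarray(activations, dtype=float)
    spike_times = np.full(activations.shape, np.inf)  # inf = no spike
    active = activations > 0
    # Assumed rule: latency is inversely proportional to activation strength.
    spike_times[active] = t_max / activations[active]
    return spike_times

# Example: three units with decreasing activation strengths.
print(intensity_to_latency([4.0, 1.0, 0.0]))  # [0.25  1.  inf]
```

Because classification only depends on which neurons fire first, any monotonically decreasing mapping from activation to latency would preserve the same rank order.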

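Below is a minimal sketch of a simplified, timing-sign-based STDP rule of the kind used in this line of work: the weight change depends only on whether the presynaptic spike precedes the postsynaptic one, and a w*(1-w) factor keeps each weight bounded in [0, 1]. The function name and the a_plus/a_minus learning rates are illustrative, not the authors' exact parameters.

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.004, a_minus=-0.003):
    """Simplified STDP: potentiate if the presynaptic spike arrives before
    (or with) the postsynaptic spike, otherwise depress; the w*(1-w) factor
    keeps weights in [0, 1]."""
    w = np.asarray(w, dtype=float)
    potentiate = (t_pre <= t_post)                 # pre fired before post
    a = np.where(potentiate, a_plus, a_minus)      # LTP vs. LTD
    return np.clip(w + a * w * (1.0 - w), 0.0, 1.0)

# A presynaptic spike at t=2 before a postsynaptic spike at t=5 strengthens
# the weight; the reverse order weakens it.
print(stdp_update(0.5, t_pre=2.0, t_post=5.0))   # slightly above 0.5
print(stdp_update(0.5, t_pre=7.0, t_post=5.0))   # slightly below 0.5
```

With latency coding, the spikes that arrive before a neuron's first spike are exactly the inputs that caused it to fire, so this rule reinforces the most salient, frequently co-occurring input patterns without any labels.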
Implications and Future Directions

The results suggest that combining STDP with temporal coding offers a path toward more energy-efficient and biologically plausible neural networks, which could be particularly advantageous for neuromorphic hardware. The approach also aligns with the sparse firing observed in the primate visual system, hinting that it captures some of that system's efficiency and robustness.

Practically, the network's sparse spiking makes it well suited to low-power devices; theoretically, it contributes to our understanding of biologically plausible learning. Future work could incorporate additional biologically inspired mechanisms, such as reward-modulated (dopamine-like) reinforcement, to improve recognition accuracy and handle more complex datasets.

In conclusion, the paper marks a notable step toward biologically inspired artificial vision systems and highlights directions for bringing artificial models and biological systems into closer agreement. It advances the pursuit of models that match the recognition capacities of deep networks while learning in the energy-efficient, unsupervised manner of the biological brain.
