- The paper introduces a conversion method that moves an ATIS sensor on a pan-tilt mount across displayed static images to generate realistic spiking datasets.
- Moving the sensor emulates biological saccades and sidesteps display refresh-rate artifacts, enabling fair comparisons with traditional Computer Vision methods.
- Initial benchmarks, including an 83.44% accuracy with SKIM on N-MNIST, highlight the method’s potential for advancing neuromorphic research.
Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades
The paper presents a method for converting the static image datasets used in traditional Computer Vision into spiking datasets suitable for Neuromorphic Vision, addressing a critical gap in the resources available to neuromorphic researchers. The authors mount an Asynchronous Time-based Image Sensor (ATIS) on a pan-tilt platform and move it in biologically inspired saccade-like motions, converting the well-established MNIST and Caltech101 datasets into the N-MNIST and N-Caltech101 spiking datasets.
Methodological Approach
The challenge in creating large Neuromorphic Vision datasets is that the sensors respond only to change: a stationary event-based sensor viewing a static scene produces almost no output, so frame-based datasets cannot simply be photographed. The authors therefore generate events from motion-induced changes in image intensity, moving the ATIS itself rather than the scene. Moving the sensor emulates a more realistic biological sensing process and avoids the artifacts caused by display refresh rates in traditional screen-based simulation setups.
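To see why motion is essential, consider a toy emulation of a change-based pixel array: events appear only when relative motion shifts intensity edges across pixels. The sketch below is a simplified illustration, not the authors' hardware pipeline; the log-intensity threshold, the shift pattern, and the per-pixel reset behaviour are all illustrative assumptions.

```python
import numpy as np

def emulate_events(image, shifts, threshold=0.15):
    """Toy change-based sensor: emit (t, x, y, polarity) whenever a
    pixel's log intensity changes by more than `threshold` between
    successive sensor positions. Threshold and shift pattern are
    illustrative assumptions, not the paper's parameters.

    image:  2D float array in [0, 1]
    shifts: sequence of (dx, dy) integer offsets standing in for
            sensor motion relative to the scene
    """
    log_img = np.log1p(image.astype(np.float64))
    dx0, dy0 = shifts[0]
    ref = np.roll(log_img, (dy0, dx0), axis=(0, 1))  # initial view
    events = []
    for t, (dx, dy) in enumerate(shifts[1:], start=1):
        view = np.roll(log_img, (dy, dx), axis=(0, 1))
        diff = view - ref
        ys, xs = np.nonzero(np.abs(diff) > threshold)
        for y, x in zip(ys, xs):
            events.append((t, int(x), int(y), int(diff[y, x] > 0)))
            ref[y, x] = view[y, x]  # reset fired pixels to the new level
    return events

# A stationary sensor over a static image yields no events;
# a one-pixel shift fires events along the square's edges.
img = np.zeros((34, 34)); img[10:24, 10:24] = 1.0
print(len(emulate_events(img, [(0, 0), (0, 0)])))  # 0
print(len(emulate_events(img, [(0, 0), (1, 0)])))  # > 0
```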
Technical Implementation
The conversion process records from a real sensor, so the resulting datasets retain the noise and non-idealities of real-world operation rather than those of a purely synthetic simulation. Moving the sensor instead of relying on motion shown on a display makes the converted datasets more realistic and therefore more useful for algorithm evaluation and comparison. Because the pan-tilt platform rotates the camera, the saccade-like motions approximate the rotational eye movements observed in human and primate vision; a sketch of such a motion program follows.
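As a concrete but purely illustrative picture of a saccade-like motion program, the sketch below generates pan-tilt angle commands along a short closed path. The triangular waypoint pattern, the 2-degree amplitude, and the linear per-tick interpolation are assumptions for illustration, not the authors' actual motor commands.

```python
import numpy as np

def saccade_waypoints(amplitude_deg=2.0):
    """Three saccade-like motions tracing a closed triangular path.
    The triangle shape and 2-degree amplitude are illustrative
    assumptions, not the paper's actual motor program."""
    a = amplitude_deg
    return [(0.0, 0.0), (a / 2, -a), (-a / 2, -a), (0.0, 0.0)]

def interpolate_trajectory(waypoints, steps_per_segment=100):
    """Linearly interpolate between (pan, tilt) waypoints, producing
    one command per control tick (e.g. 1 ms per step)."""
    commands = []
    for (p0, t0), (p1, t1) in zip(waypoints, waypoints[1:]):
        for s in np.linspace(0.0, 1.0, steps_per_segment, endpoint=False):
            commands.append((p0 + s * (p1 - p0), t0 + s * (t1 - t0)))
    commands.append(waypoints[-1])
    return commands

traj = interpolate_trajectory(saccade_waypoints())
print(len(traj), traj[0], traj[-1])  # 301 commands, starts and ends at rest
```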
Results on N-MNIST and N-Caltech101
The N-MNIST and N-Caltech101 datasets provide a direct bridge between Computer Vision benchmarks and Neuromorphic Vision research. The authors report initial baselines using existing recognition algorithms, k-Nearest Neighbour (kNN), the Synaptic Kernel Inverse Method (SKIM), and HFIRST, establishing reference accuracies against which future methods can be measured. Notably, SKIM achieved 83.44% accuracy on the N-MNIST dataset.
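For readers who want to load the recordings themselves, N-MNIST events are commonly documented as 40-bit records: an 8-bit x address, an 8-bit y address, a polarity bit, and a 23-bit timestamp in microseconds. The decoder below follows that widely circulated layout; treat the exact bit positions as an assumption to verify against the official dataset documentation, and the file path in the usage comment as hypothetical.

```python
import numpy as np

def read_nmnist_bin(path):
    """Decode an N-MNIST .bin recording into event arrays.

    Assumed layout (verify against the official dataset docs):
    each event is 5 bytes -- byte 0: x address, byte 1: y address,
    byte 2 bit 7: polarity, remaining 23 bits (byte 2 bits 6-0,
    bytes 3-4): timestamp in microseconds.
    """
    raw = np.fromfile(path, dtype=np.uint8).reshape(-1, 5)
    x = raw[:, 0].astype(np.int32)
    y = raw[:, 1].astype(np.int32)
    polarity = (raw[:, 2] >> 7).astype(np.int32)  # 1 = ON, 0 = OFF
    timestamp = (((raw[:, 2].astype(np.int64) & 0x7F) << 16)
                 | (raw[:, 3].astype(np.int64) << 8)
                 | raw[:, 4].astype(np.int64))    # microseconds
    return x, y, polarity, timestamp

# Hypothetical usage (path depends on where the dataset is unpacked):
# x, y, p, t = read_nmnist_bin("Train/0/00002.bin")
# print(f"{len(t)} events spanning {t[-1] / 1e6:.3f} s")
```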
Implications and Future Directions
The datasets introduced constitute a significant contribution to the neuromorphic research community. They facilitate direct comparison with frame-based methods and provide a common target for innovation in neuromorphic sensory processing. The conversion method also points toward broader adoption of embodied, moving sensors over static configurations, laying a foundation for mobile, biologically inspired sensing and for dynamic vision systems applicable to the real world.
Conclusion
This work closes a substantial gap in Neuromorphic Vision research by providing scalable datasets that are directly comparable to established frame-based benchmarks. The use of biologically relevant motion in the conversion process sets a precedent for future dataset creation. As research into Neuromorphic Vision systems continues, these datasets are likely to remain a standard reference for benchmarking and innovation in the field.