Deep Reinforcement Learning with Spiking Q-learning (2201.09754v3)
Abstract: With the help of special neuromorphic hardware, spiking neural networks (SNNs) are expected to realize AI with less energy consumption. Combining SNNs with deep reinforcement learning (RL) offers a promising energy-efficient approach to realistic control tasks. Only a few SNN-based RL methods exist at present, and most of them either lack generalization ability or employ artificial neural networks (ANNs) to estimate the value function during training. The former require tuning numerous hyper-parameters for each scenario; the latter limit the types of RL algorithms that can be applied and ignore the large energy consumption of training. To develop a robust spike-based RL method, we draw inspiration from non-spiking interneurons found in insects and propose the deep spiking Q-network (DSQN), which uses the membrane voltage of non-spiking neurons as the representation of the Q-value and can directly learn robust policies from high-dimensional sensory inputs using end-to-end RL. Experiments conducted on 17 Atari games demonstrate that DSQN is effective and even outperforms the ANN-based deep Q-network (DQN) in most games. Moreover, the experiments show that DSQN has superior learning stability and robustness to adversarial attacks.
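The central idea in the abstract, reading Q-values from the membrane voltage of non-spiking output neurons, can be illustrated with a small sketch. This is not the authors' implementation. The leaky integrate-and-fire hidden layer, the time constant `tau`, the threshold `v_th`, the hard reset, the number of simulation steps, the choice to read the final voltage, and all function names are assumptions made here for illustration.

```python
# Illustrative sketch only: every name and constant below is hypothetical.

def lif_layer_step(v, inputs, weights, tau=2.0, v_th=1.0):
    """Advance a leaky integrate-and-fire layer by one time step.

    Returns the updated membrane voltages and the binary spikes emitted.
    """
    new_v, spikes = [], []
    for j, v_j in enumerate(v):
        current = sum(w * x for w, x in zip(weights[j], inputs))
        v_j = v_j + (current - v_j) / tau      # leaky integration
        fired = 1.0 if v_j >= v_th else 0.0    # threshold crossing
        new_v.append(0.0 if fired else v_j)    # hard reset after a spike
        spikes.append(fired)
    return new_v, spikes


def non_spiking_step(v, spikes, weights, tau=2.0):
    """Advance non-spiking output neurons: leaky integration, no threshold, no reset."""
    return [
        v_k + (sum(w * s for w, s in zip(weights[k], spikes)) - v_k) / tau
        for k, v_k in enumerate(v)
    ]


def q_values(obs, w_hidden, w_out, timesteps=8):
    """Simulate the network for a fixed number of steps and read out Q-values.

    Assumption: the membrane voltage after the last step is used as the Q-value.
    """
    v_hidden = [0.0] * len(w_hidden)
    v_out = [0.0] * len(w_out)
    for _ in range(timesteps):
        v_hidden, spikes = lif_layer_step(v_hidden, obs, w_hidden)
        v_out = non_spiking_step(v_out, spikes, w_out)
    return v_out


def greedy_action(q):
    """Pick the action whose output neuron has the highest membrane voltage."""
    return max(range(len(q)), key=q.__getitem__)


if __name__ == "__main__":
    w_hidden = [[2.0, 0.0], [0.0, 0.0]]   # 2 inputs -> 2 spiking hidden neurons
    w_out = [[1.0, 0.0], [-1.0, 0.0]]     # 2 hidden -> 2 actions
    q = q_values([1.0, 0.0], w_hidden, w_out)
    print(q, greedy_action(q))
```

Because the output neurons never fire or reset, their voltage is a continuous quantity that can take any real value, which is what lets it stand in for a Q-value while the hidden layers communicate only through binary spikes.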