Noisy Spiking Actor Network for Exploration (2403.04162v1)
Published 7 Mar 2024 in cs.LG and cs.NE
Abstract: As a general method for exploration in deep reinforcement learning (RL), NoisyNet can produce problem-specific exploration strategies. Spiking neural networks (SNNs), owing to their binary firing mechanism, are highly robust to noise, which makes it difficult to realize efficient exploration through local disturbances. To solve this exploration problem, we propose a noisy spiking actor network (NoisySAN) that introduces time-correlated noise during charging and transmission. Moreover, a noise reduction method is proposed to find a stable policy for the agent. Extensive experimental results demonstrate that our method outperforms the state of the art on a wide range of continuous control tasks from OpenAI Gym.
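The abstract's core idea, injecting time-correlated noise into a spiking neuron's charging step, can be illustrated with a minimal sketch. The snippet below simulates a single leaky integrate-and-fire (LIF) neuron whose membrane charging is perturbed by AR(1) (colored) noise; all parameter names and values here are illustrative assumptions, not the paper's actual NoisySAN implementation.

```python
import numpy as np

def lif_with_colored_noise(inputs, tau_mem=2.0, v_th=1.0,
                           beta=0.9, sigma=0.1, seed=0):
    """Single LIF neuron with time-correlated noise added during charging.

    Illustrative stand-in for NoisySAN's noisy charging step; the AR(1)
    noise model and every parameter here are assumptions for this sketch.
    """
    rng = np.random.default_rng(seed)
    decay = np.exp(-1.0 / tau_mem)   # membrane decay per time step
    v, eps = 0.0, 0.0                # membrane potential, noise state
    spikes = []
    for x in inputs:
        # Time-correlated noise: eps_t = beta * eps_{t-1} + white noise,
        # scaled so the stationary variance stays sigma^2.
        eps = beta * eps + np.sqrt(1.0 - beta ** 2) * rng.normal(0.0, sigma)
        v = decay * v + x + eps      # noisy charging
        s = 1 if v >= v_th else 0    # binary firing
        v = v * (1 - s)              # hard reset after a spike
        spikes.append(s)
    return spikes

# Constant drive below threshold per step but above it at the fixed point,
# so the neuron fires intermittently, modulated by the correlated noise.
train = lif_with_colored_noise([0.5] * 50)
```

Because the noise is correlated across time steps, the perturbation persists long enough to flip the binary firing decision, which is what lets noise drive exploration despite the SNN's robustness to independent per-step disturbances.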