Deep Reinforcement Learning with Population-Coded Spiking Neural Network for Continuous Control (2010.09635v1)

Published 19 Oct 2020 in cs.NE, cs.LG, and cs.RO

Abstract: The energy-efficient control of mobile robots is crucial as the complexity of their real-world applications increasingly involves high-dimensional observation and action spaces, which cannot be offset by limited on-board resources. An emerging non-Von Neumann model of intelligence, where spiking neural networks (SNNs) are run on neuromorphic processors, is regarded as an energy-efficient and robust alternative to the state-of-the-art real-time robotic controllers for low dimensional control tasks. The challenge now for this new computing paradigm is to scale so that it can keep up with real-world tasks. To do so, SNNs need to overcome the inherent limitations of their training, namely the limited ability of their spiking neurons to represent information and the lack of effective learning algorithms. Here, we propose a population-coded spiking actor network (PopSAN) trained in conjunction with a deep critic network using deep reinforcement learning (DRL). The population coding scheme dramatically increased the representation capacity of the network and the hybrid learning combined the training advantages of deep networks with the energy-efficient inference of spiking networks. To show the general applicability of our approach, we integrated it with a spectrum of both on-policy and off-policy DRL algorithms. We deployed the trained PopSAN on Intel's Loihi neuromorphic chip and benchmarked our method against the mainstream DRL algorithms for continuous control. To allow for a fair comparison among all methods, we validated them on OpenAI gym tasks. Our Loihi-run PopSAN consumed 140 times less energy per inference when compared against the deep actor network on Jetson TX2, and had the same level of performance. Our results support the efficiency of neuromorphic controllers and suggest our hybrid RL as an alternative to deep learning, when both energy-efficiency and robustness are important.

Citations (61)

Summary

  • The paper proposes PopSAN, a population-coded spiking neural network integrated with deep reinforcement learning for highly energy-efficient continuous control.
  • Evaluations show PopSAN matches the performance of conventional deep actor networks while consuming 140x less energy per inference on neuromorphic hardware (Loihi) than the corresponding deep network on an edge GPU (Jetson TX2).
  • Population coding substantially increases the representation capacity of SNNs, which is essential for energy-efficient control in complex, high-dimensional continuous tasks.

Deep Reinforcement Learning with Population-Coded Spiking Neural Network for Continuous Control

This paper by Tang et al. explores continuous control for mobile robotic systems using spiking neural networks (SNNs) executed on neuromorphic hardware. The primary focus is energy-efficient control, which is especially relevant for real-world applications with high-dimensional observation and action spaces. The authors introduce a population-coded spiking actor network, termed PopSAN, integrated with deep reinforcement learning (DRL) algorithms to significantly enhance the representation capacity and energy efficiency of SNNs.

Background and Objectives

SNNs, especially when implemented on neuromorphic processors, offer a promising alternative to traditional von Neumann architectures because of their low energy consumption, which makes them particularly suitable for mobile robots with limited on-board computing resources. Traditional DRL methods, although effective at optimizing control policies, incur high energy costs, making them suboptimal for such applications. The challenge addressed in the paper is scaling SNNs to high-dimensional control tasks by overcoming limitations in neuron representation capacity and in learning algorithms.

Methodology

The authors propose a novel architecture combining SNNs with population coding, a technique inspired by biological neural representations. This strategy improves the network's ability to encode information by representing observations and actions through populations of neurons with learnable receptive fields (a minimal encoding sketch follows below). The research integrates PopSAN with both on-policy and off-policy DRL methods, including DDPG, TD3, SAC, and PPO, demonstrating broad applicability across DRL strategies.
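
To make the encoding step concrete, the following is a minimal PyTorch sketch of Gaussian population encoding with learnable receptive fields, in the spirit of PopSAN's encoder. The class name PopulationEncoder, the tensor shapes, and the initialization values are our illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class PopulationEncoder(nn.Module):
    """Encode each observation dimension with a population of neurons
    whose Gaussian receptive fields (means, stds) are learnable.
    Hypothetical sketch; names and shapes are illustrative."""

    def __init__(self, obs_dim: int, pop_size: int):
        super().__init__()
        # One receptive-field center and width per neuron, per observation dim.
        self.means = nn.Parameter(
            torch.linspace(-1.0, 1.0, pop_size).repeat(obs_dim, 1))
        self.stds = nn.Parameter(torch.full((obs_dim, pop_size), 0.5))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch, obs_dim). Each scalar most strongly stimulates the
        # neurons whose receptive-field centers lie closest to its value.
        x = obs.unsqueeze(-1)  # (batch, obs_dim, 1)
        act = torch.exp(-0.5 * ((x - self.means) / self.stds) ** 2)
        return act.flatten(start_dim=1)  # (batch, obs_dim * pop_size)
```

In the paper's pipeline, graded stimulation strengths of this kind then drive spike generation in the encoder populations over a fixed number of simulation timesteps.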

PopSAN is trained alongside a deep critic network, leveraging the SNN's capacity for asynchronous, event-based computation, which is more energy-efficient than the dense, synchronous inference of conventional deep networks (see the training sketch below). The trained actor is then deployed on Intel's Loihi neuromorphic chip, yielding significant energy savings compared to traditional deep actor networks.
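
To show how such hybrid learning can slot into a standard actor-critic update, here is a schematic PyTorch sketch in the style of DDPG/TD3 (two of the algorithms the paper integrates with). The function and argument names (hybrid_update, popsan, critic, and the optimizers) are our own, and the surrogate-gradient machinery inside the spiking actor is assumed rather than shown; this is not the paper's code.

```python
import torch

def hybrid_update(popsan, critic, target_popsan, target_critic,
                  batch, actor_opt, critic_opt, gamma=0.99):
    # batch: (obs, act, rew, next_obs, done) tensors; done is 0/1 floats.
    obs, act, rew, next_obs, done = batch

    # Critic step: ordinary deep-RL TD learning on a standard MLP critic.
    with torch.no_grad():
        next_act = target_popsan(next_obs)  # spiking forward pass
        target_q = rew + gamma * (1 - done) * target_critic(next_obs, next_act)
    critic_loss = ((critic(obs, act) - target_q) ** 2).mean()
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor step: the critic's gradient flows back through the spiking
    # actor; the nondifferentiable spikes are assumed to use a surrogate
    # gradient so that backward() is well defined.
    actor_loss = -critic(obs, popsan(obs)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()
```

Only actor_opt holds the actor's parameters, so the actor step leaves the critic unchanged even though gradients pass through it, mirroring standard DDPG/TD3 practice.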

Results and Implications

Empirical evaluations on OpenAI gym tasks demonstrate that PopSAN achieves performance comparable to existing DRL methods while consuming substantially less energy: 140 times less per inference than the corresponding deep actor network deployed on standard edge hardware (Jetson TX2). This finding has considerable practical implications for scenarios that demand both robust control and energy conservation, such as autonomous navigation on resource-limited platforms.

The authors further analyze the role of population coding, showing that it encodes inputs and outputs more effectively than rate-coded single-neuron representations. This gain in representation capacity is pivotal for complex, high-dimensional continuous control tasks and suggests an important shift in how spiking neural networks are trained (a decoder sketch follows below).
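
As a complement to the encoder sketch above, the following hypothetical decoder shows one way the spike activity of each output population could be turned into a continuous action: spike trains are averaged into firing rates, then passed through a learned per-population linear readout. The shapes, names, and the final tanh squashing are our assumptions, not the paper's exact decoder.

```python
import torch
import torch.nn as nn

class PopulationDecoder(nn.Module):
    """Map each action dimension's output population to a continuous
    action via learned readout weights. Illustrative sketch only."""

    def __init__(self, act_dim: int, pop_size: int):
        super().__init__()
        self.act_dim, self.pop_size = act_dim, pop_size
        self.readout = nn.Parameter(torch.randn(act_dim, pop_size) * 0.1)
        self.bias = nn.Parameter(torch.zeros(act_dim))

    def forward(self, spikes: torch.Tensor) -> torch.Tensor:
        # spikes: (batch, act_dim * pop_size, T) binary spike trains
        # accumulated over T simulation timesteps.
        rates = spikes.mean(dim=-1)  # per-neuron firing rates
        rates = rates.view(-1, self.act_dim, self.pop_size)
        # Weighted sum per population, squashed to a bounded action range.
        return torch.tanh((rates * self.readout).sum(dim=-1) + self.bias)
```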

Future Directions

The research illustrates potential pathways for future AI developments, particularly in sectors where energy efficiency is paramount. Advances could come from leveraging event-driven sensors or employing memristive neuromorphic processors, pushing efficiency beyond what is currently possible with digital solutions. Moreover, integrating DRL algorithms into the spiking domain opens interesting opportunities for reinforcement learning settings where sample efficiency, environment stochasticity, and safety considerations dictate the choice of strategy.

Conclusion

The work by Tang et al. makes a significant contribution to energy-efficient neuromorphic solutions for robotic control. By effectively merging population-coded SNNs with DRL algorithms, the paper presents a compelling alternative to traditional deep learning approaches. It sets the stage for broader exploration of neuromorphic computing in real-world reinforcement learning applications, underscoring PopSAN as a viable option where both energy efficiency and high performance are essential.