The Dormant Neuron Phenomenon in Deep Reinforcement Learning
Abstract: In this work, we identify the dormant neuron phenomenon in deep reinforcement learning, where an agent's network suffers from an increasing number of inactive neurons, thereby reducing network expressivity. We demonstrate the presence of this phenomenon across a variety of algorithms and environments, and highlight its effect on learning. To address this issue, we propose a simple and effective method (ReDo) that Recycles Dormant neurons throughout training. Our experiments demonstrate that ReDo maintains the expressive power of networks by reducing the number of dormant neurons, which results in improved performance.
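The abstract does not spell out how dormancy is measured or how neurons are recycled, so the following is only a rough sketch under stated assumptions. It assumes a neuron counts as dormant when its mean absolute activation over a batch, normalized by the layer-wide average, is at or below a threshold `tau`. It also assumes recycling means re-drawing the neuron's incoming weights and zeroing its outgoing weights. The function names `dormant_mask` and `recycle`, the Gaussian re-initialization, and the `scale` parameter are hypothetical choices for illustration, not taken from the text above.

```python
import numpy as np


def dormant_mask(activations, tau=0.0, eps=1e-9):
    """Flag neurons of one layer whose normalized activity is <= tau.

    activations: array of shape (batch, n_neurons) holding the layer's
    post-activation outputs on a batch of inputs (assumed layout).
    """
    # Average magnitude of each neuron's output over the batch.
    mean_abs = np.abs(activations).mean(axis=0)
    # Normalize by the layer-wide average so scores are comparable across layers.
    scores = mean_abs / (mean_abs.mean() + eps)
    return scores <= tau


def recycle(w_in, b_in, w_out, mask, rng, scale=0.1):
    """Return copies of the weights with the flagged neurons reset.

    w_in:  (n_inputs, n_neurons) incoming weights of the layer.
    b_in:  (n_neurons,) biases of the layer.
    w_out: (n_neurons, n_outputs) weights of the following layer.
    """
    w_in, b_in, w_out = w_in.copy(), b_in.copy(), w_out.copy()
    n_dormant = int(mask.sum())
    # Assumption: fresh incoming weights are drawn from a small Gaussian.
    w_in[:, mask] = rng.normal(0.0, scale, size=(w_in.shape[0], n_dormant))
    b_in[mask] = 0.0
    # Zero outgoing weights so the reset does not change the network's
    # output at the moment of recycling.
    w_out[mask, :] = 0.0
    return w_in, b_in, w_out
```

In a training loop, one might call `dormant_mask` on a replay batch every fixed number of updates and pass the result to `recycle`. Zeroing the outgoing weights is the design choice that keeps the agent's predictions unchanged at the instant a neuron is reset.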