Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization (2309.08546v3)
Abstract: The pursuit of long-term autonomy mandates that machine learning models must continuously adapt to their changing environments and learn to solve new tasks. Continual learning seeks to overcome the challenge of catastrophic forgetting, where learning to solve new tasks causes a model to forget previously learnt information. Prior-based continual learning methods are appealing as they are computationally efficient and do not require auxiliary models or data storage. However, prior-based approaches typically fail on important benchmarks and are thus limited in their potential applications compared to their memory-based counterparts. We introduce Bayesian adaptive moment regularization (BAdam), a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting. Our method boasts a range of desirable properties such as being lightweight and task label-free, converging quickly, and offering calibrated uncertainty that is important for safe real-world deployment. Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments such as Split MNIST and Split FashionMNIST, and does so without relying on task labels or discrete task boundaries.
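The abstract contrasts prior-based methods with memory-based ones. As background, the prior-based family the paper builds on (e.g. EWC-style regularization) shares a common form: a quadratic penalty that anchors each parameter to its post-task value, weighted by an importance estimate. The sketch below illustrates that generic penalty only; it is not the BAdam update itself, and the function and argument names are illustrative.

```python
def prior_penalty(params, prior_means, importances, lam=1.0):
    """Generic prior-based regularizer (EWC-style), added to the task loss.

    Computes 0.5 * lam * sum_i omega_i * (theta_i - mu_i)^2, where mu_i is
    the parameter value after previous tasks and omega_i its estimated
    importance. High-importance parameters are constrained to stay near
    their old values, mitigating catastrophic forgetting.
    """
    return 0.5 * lam * sum(
        omega * (theta - mu) ** 2
        for theta, mu, omega in zip(params, prior_means, importances)
    )
```

Methods in this family differ mainly in how the importances `omega_i` are estimated (Fisher information, path integrals, or, as in the paper's Bayesian framing, posterior precision); BAdam instead folds this constraint into an Adam-like moment update.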