Natural Mitigation of Catastrophic Interference: Continual Learning in Power-Law Learning Environments (2401.10393v3)
Abstract: Neural networks often suffer from catastrophic interference (CI): performance on previously learned tasks drops significantly when a new task is learned. This contrasts sharply with humans, who can continually learn new tasks without appreciably forgetting previous ones. Prior work has explored various techniques for mitigating CI and promoting continual learning, such as regularization, rehearsal, generative replay, and context-specific components. This paper takes a different approach, guided by cognitive science research showing that in naturalistic environments, the probability of encountering a task decreases as a power law of the time since it was last performed. We argue that techniques for mitigating CI should be compared against the intrinsic mitigation provided by simulated naturalistic learning environments. We therefore evaluate the extent to which CI is naturally mitigated when models are trained in power-law environments similar to those humans face. Our results show that natural rehearsal environments mitigate CI better than existing methods, highlighting the need for better evaluation procedures. The benefits of this environment include simplicity, rehearsal that is agnostic to both tasks and models, and no need for extra neural circuitry. In addition, we explore popular mitigation techniques in power-law environments to establish new baselines for continual learning research.
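To make the power-law environment concrete, the sketch below shows one way a task-sampling schedule with this property could be simulated. It is a minimal illustration, not the paper's actual training procedure: the helper `sample_task`, the exponent `alpha`, and the toy loop are assumptions introduced here; the only property taken from the abstract is that the probability of rehearsing a task decays as a power law of the time since it was last performed.

```python
import numpy as np

def sample_task(last_seen, step, alpha=1.0, rng=None):
    """Sample the next task so that the probability of rehearsing a task
    decays as a power law of the time elapsed since it was last performed.
    Illustrative sketch only; `alpha` is a hypothetical exponent."""
    rng = rng if rng is not None else np.random.default_rng()
    elapsed = step - np.asarray(last_seen, dtype=float) + 1.0  # +1 avoids zero
    weights = elapsed ** (-alpha)  # p(task) proportional to elapsed^(-alpha)
    return rng.choice(len(weights), p=weights / weights.sum())

# Toy usage: 5 tasks rehearsed over 20 steps under the power-law schedule.
last_seen = np.zeros(5)
for t in range(1, 21):
    task = sample_task(last_seen, t, alpha=1.5)
    # ... train one batch on `task` here ...
    last_seen[task] = t
```

Under this kind of schedule, recently practiced tasks recur often while long-unvisited tasks are revisited rarely but never vanish entirely, which is the rehearsal structure the abstract credits with naturally mitigating CI.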