
Fixed Random Classifier Rearrangement for Continual Learning

Published 23 Feb 2024 in cs.LG and cs.AI | arXiv:2402.15227v1

Abstract: With the explosive growth of data, continual learning capability is increasingly important for neural networks. Due to catastrophic forgetting, neural networks inevitably forget the knowledge of old tasks after learning new ones. In the visual classification scenario, a common practice for alleviating forgetting is to constrain the backbone. However, the impact of the classifier is underestimated. In this paper, we analyze the variation of model predictions in sequential binary classification tasks and find that the norm of the equivalent one-class classifiers significantly affects the level of forgetting. Based on this conclusion, we propose a two-stage continual learning algorithm named Fixed Random Classifier Rearrangement (FRCR). In the first stage, FRCR replaces the learnable classifiers with fixed random classifiers, constraining the norm of the equivalent one-class classifiers without affecting the performance of the network. In the second stage, FRCR rearranges the entries of the new classifiers to implicitly reduce the drift of the old latent representations. Experimental results on multiple datasets show that FRCR significantly mitigates model forgetting, and subsequent experimental analyses further validate the effectiveness of the algorithm.
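
The abstract describes the two FRCR stages only at a high level; the sketch below is one way those steps could look in PyTorch, under our own assumptions. The class FixedRandomClassifier, the helper rearrange_new_classifier, the Gaussian initialization scale, and the greedy permutation search are illustrative choices, not the paper's actual implementation.

```python
# A minimal, hypothetical sketch of the two FRCR stages described in the abstract.
# All names and the concrete rearrangement rule below are assumptions for illustration.
import torch
import torch.nn as nn


class FixedRandomClassifier(nn.Module):
    """Stage 1 (sketch): a linear head drawn randomly once and then frozen, so the norm
    of the equivalent one-class classifiers is constrained and never grows during training."""

    def __init__(self, feature_dim: int, num_classes: int, std: float = 0.01):
        super().__init__()
        # A buffer is not returned by .parameters(), so the optimizer never updates it.
        self.register_buffer("weight", torch.randn(num_classes, feature_dim) * std)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return features @ self.weight.t()


def rearrange_new_classifier(new_weight: torch.Tensor,
                             old_weights: list[torch.Tensor],
                             num_candidates: int = 32) -> torch.Tensor:
    """Stage 2 (illustrative rule, not the paper's): permute the entries of each new
    classifier row so it overlaps as little as possible with the rows already used by
    earlier tasks, aiming to implicitly reduce drift of the old latent representations."""
    if not old_weights:
        return new_weight
    old = torch.cat(old_weights, dim=0)            # (num_old_classes, feature_dim)
    rearranged_rows = []
    for row in new_weight:                         # one equivalent one-class classifier per row
        candidates = [row[torch.randperm(row.numel())] for _ in range(num_candidates)]
        overlap = torch.stack([old.mv(c).abs().sum() for c in candidates])
        rearranged_rows.append(candidates[int(overlap.argmin())])
    return torch.stack(rearranged_rows)


if __name__ == "__main__":
    torch.manual_seed(0)
    head_task1 = FixedRandomClassifier(feature_dim=128, num_classes=2)
    head_task2 = FixedRandomClassifier(feature_dim=128, num_classes=2)
    # Rearrange the second task's fixed classifier relative to the first task's.
    head_task2.weight = rearrange_new_classifier(head_task2.weight, [head_task1.weight])
    logits = head_task2(torch.randn(4, 128))
    print(logits.shape)  # torch.Size([4, 2])
```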
