Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning (2306.11967v1)
Abstract: In class-incremental learning (CIL), deep neural networks must adapt their parameters to non-stationary data distributions, e.g., the emergence of new classes over time. However, CIL models are challenged by the well-known catastrophic forgetting phenomenon. Typical methods, such as rehearsal-based ones, rely on storing exemplars of old classes to mitigate catastrophic forgetting, which limits real-world applications given memory constraints and privacy concerns. In this paper, we propose a novel rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks. Our approach jointly optimizes a plastic CNN feature extractor and an analytical feed-forward classifier. The inaccessibility of historical data is tackled by holistically controlling the parameters of a well-trained model, ensuring that the learned decision boundary fits new classes while retaining recognition of previously learned classes. Specifically, the trainable CNN feature extractor provides task-dependent knowledge separately without interference, and the final classifier integrates task-specific knowledge incrementally for decision-making without forgetting. In each CIL session, the model accommodates new tasks by attaching a tiny set of declarative parameters to its backbone, keeping only one matrix per task or one vector per class for knowledge retention. Extensive experiments on a variety of task sequences show that our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness. Furthermore, to enable the non-growing backbone (i.e., a model with fixed network capacity) to accommodate more incoming tasks, we empirically investigate graceful forgetting of previously learned trivial tasks.
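Since raw exemplars are unavailable in this setting, one way to realize the abstract's "analytical feed-forward classifier" is to carry only fixed-size statistics across sessions and re-solve a linear head in closed form after each task. The sketch below is a minimal illustration of that idea, assuming a ridge-regression head on top of CNN features; the class names, the tiny stand-in backbone, and the regularization constant are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the two-subnetwork idea: a trainable CNN feature extractor
# plus an analytical (closed-form) linear classifier refreshed per task from
# accumulated statistics, so no raw exemplars are stored. All names below are
# illustrative assumptions, not the paper's exact formulation.
import torch
import torch.nn as nn


class AnalyticClassifier:
    """Linear head solved in closed form via ridge regression.

    Only fixed-size statistics are kept across tasks: a feature
    autocorrelation matrix A and a feature-label cross-correlation
    matrix B that grows by one column per class.
    """

    def __init__(self, feat_dim: int, reg: float = 1e-3):
        self.reg = reg
        self.A = torch.zeros(feat_dim, feat_dim)  # running X^T X
        self.B = torch.zeros(feat_dim, 0)         # running X^T Y
        self.W = None

    @torch.no_grad()
    def update(self, feats: torch.Tensor, labels: torch.Tensor, num_new_classes: int):
        # Expand the cross-correlation matrix for the newly arrived classes.
        self.B = torch.cat([self.B, torch.zeros(self.B.shape[0], num_new_classes)], dim=1)
        onehot = torch.zeros(feats.shape[0], self.B.shape[1])
        onehot[torch.arange(feats.shape[0]), labels] = 1.0
        self.A += feats.T @ feats
        self.B += feats.T @ onehot
        eye = torch.eye(self.A.shape[0])
        # Closed-form ridge solution over all classes seen so far.
        self.W = torch.linalg.solve(self.A + self.reg * eye, self.B)

    @torch.no_grad()
    def predict(self, feats: torch.Tensor) -> torch.Tensor:
        return (feats @ self.W).argmax(dim=1)


# Usage: extract CNN features for the current task, then refresh the head.
backbone = nn.Sequential(                         # tiny stand-in CNN backbone
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(4), nn.Flatten(),
    nn.Linear(8 * 4 * 4, 256),
)
head = AnalyticClassifier(feat_dim=256)

x_task, y_task = torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,))
with torch.no_grad():
    feats = backbone(x_task)
head.update(feats, y_task, num_new_classes=10)
print(head.predict(feats).shape)  # torch.Size([64])
```

Under this assumption, only the fixed-size matrix `A` and the per-class columns of `B` are carried forward between sessions, which loosely mirrors the abstract's claim of retaining one matrix per task or one vector per class instead of stored exemplars.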
Authors: Depeng Li, Zhigang Zeng