AdaptCL: Adaptive Continual Learning for Tackling Heterogeneity in Sequential Datasets (2207.11005v3)
Abstract: Managing heterogeneous datasets that vary in complexity, size, and similarity presents a significant challenge in continual learning. Task-agnostic continual learning is necessary to address this challenge, as datasets with varying similarity make task boundaries difficult to distinguish. Conventional task-agnostic continual learning methods typically rely on rehearsal or regularization techniques. However, rehearsal methods may struggle with varying dataset sizes and with balancing the importance of old and new data, owing to rigid buffer sizes. Meanwhile, regularization methods apply generic constraints to promote generalization but can hinder performance on dissimilar datasets lacking shared features, necessitating a more adaptive approach. In this paper, we propose AdaptCL, a novel adaptive continual learning method for tackling heterogeneity in sequential datasets. AdaptCL employs fine-grained, data-driven pruning to adapt to variations in data complexity and dataset size. It also utilizes task-agnostic parameter isolation to mitigate the varying degrees of catastrophic forgetting caused by differences in data similarity. Through a two-pronged case study, we evaluate AdaptCL on the MNIST Variants and DomainNet benchmarks, as well as on datasets from different domains; the latter include both large-scale, diverse binary-class datasets and few-shot, multi-class datasets. Across all these scenarios, AdaptCL consistently exhibits robust performance, demonstrating its flexibility and general applicability to heterogeneous datasets.
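The abstract's two mechanisms, fine-grained data-driven pruning and task-agnostic parameter isolation, can be made concrete with a small sketch. Below is a minimal PyTorch illustration, not the paper's implementation: a `MaskedLinear` layer that prunes low-magnitude trainable weights to free capacity for later datasets and then freezes the surviving weights so subsequent updates cannot overwrite them. The class name, the magnitude-based pruning criterion, and the `sparsity` parameter are illustrative assumptions rather than details taken from AdaptCL.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedLinear(nn.Module):
    """Linear layer with a pruning mask (active vs. pruned weights) and a
    frozen mask (weights isolated for previously learned datasets).
    Illustrative sketch only; not the AdaptCL reference implementation."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # 1 = weight is active, 0 = pruned (spare capacity for future datasets)
        self.register_buffer("prune_mask", torch.ones_like(self.linear.weight))
        # 1 = weight is frozen (belongs to an earlier dataset), 0 = trainable
        self.register_buffer("frozen_mask", torch.zeros_like(self.linear.weight))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pruned weights contribute nothing to the forward pass.
        return F.linear(x, self.linear.weight * self.prune_mask, self.linear.bias)

    @torch.no_grad()
    def prune_by_magnitude(self, sparsity: float) -> None:
        """Prune the smallest-magnitude weights among those still trainable.
        In a data-driven scheme, `sparsity` would be chosen per dataset."""
        trainable = self.frozen_mask == 0
        magnitudes = self.linear.weight[trainable].abs()
        k = int(sparsity * magnitudes.numel())
        if k == 0:
            return
        threshold = magnitudes.kthvalue(k).values
        newly_pruned = trainable & (self.linear.weight.abs() <= threshold)
        self.prune_mask[newly_pruned] = 0.0

    @torch.no_grad()
    def freeze_active_weights(self) -> None:
        """Isolate the weights that survived pruning for the current dataset,
        so training on later datasets cannot overwrite them."""
        self.frozen_mask[self.prune_mask == 1] = 1.0

    @torch.no_grad()
    def zero_frozen_grads(self) -> None:
        """Call after loss.backward(): cancel gradient flow to frozen weights."""
        if self.linear.weight.grad is not None:
            self.linear.weight.grad.mul_(1.0 - self.frozen_mask)
```

In a sequential training loop under these assumptions, one would call `zero_frozen_grads()` after each backward pass, `prune_by_magnitude()` once a dataset has been learned, and `freeze_active_weights()` before moving on to the next dataset.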