
Kaizen: Practical Self-supervised Continual Learning with Continual Fine-tuning (2303.17235v2)

Published 30 Mar 2023 in cs.LG

Abstract: Self-supervised learning (SSL) has shown remarkable performance in computer vision tasks when trained offline. However, in a Continual Learning (CL) scenario where new data is introduced progressively, models still suffer from catastrophic forgetting. Retraining a model from scratch to adapt to newly generated data is time-consuming and inefficient. Previous approaches suggested re-purposing self-supervised objectives with knowledge distillation to mitigate forgetting across tasks, assuming that labels from all tasks are available during fine-tuning. In this paper, we generalize self-supervised continual learning in a practical setting where available labels can be leveraged in any step of the SSL process. With an increasing number of continual tasks, this offers more flexibility in the pre-training and fine-tuning phases. With Kaizen, we introduce a training architecture that is able to mitigate catastrophic forgetting for both the feature extractor and classifier with a carefully designed loss function. By using a set of comprehensive evaluation metrics reflecting different aspects of continual learning, we demonstrated that Kaizen significantly outperforms previous SSL models in competitive vision benchmarks, with up to 16.5% accuracy improvement on split CIFAR-100. Kaizen is able to balance the trade-off between knowledge retention and learning from new data with an end-to-end model, paving the way for practical deployment of continual learning systems.

Introduction to Continual Learning Framework

Continual Learning (CL) is a significant direction in artificial intelligence whose goal is to build systems capable of learning from a continuous stream of data, adapting to new patterns without forgetting previously acquired knowledge. Self-Supervised Learning (SSL) methods have been highlighted for their ability to exploit unlabelled data effectively, achieving impressive performance in vision tasks. Traditional CL methods, by contrast, focus primarily on supervised setups and are therefore constrained by the cost and availability of labels.

The integration of SSL paradigms with CL is still an evolving field. In recent developments, a more practical perspective on CL has been proposed through a new framework known as "Kaizen," a term inspired by the Japanese concept of continuous improvement. This research presents an architecture that addresses the real-world challenges of applying SSL in a CL setting.

Generalizing Self-Supervised Continual Learning

The paper's central motivation is to make Continual Self-Supervised Learning (CSSL) practical. Existing CSSL models assume that labels from all tasks are accessible for fine-tuning, which conflicts with real-world constraints such as privacy restrictions and limited labeling resources. Kaizen proposes an improved CSSL framework whose training architecture is suited to practical deployment and mitigates catastrophic forgetting in both the feature extractor and the classifier.
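To make this two-component design concrete, below is a minimal PyTorch sketch of one plausible reading of the architecture. All names (KaizenModel, start_new_task, teacher_fe, teacher_clf) are illustrative assumptions, not the authors' code: before each new task, the current feature extractor and classifier are frozen as "teachers," while trainable copies continue learning as "students."

```python
import copy

import torch.nn as nn


class KaizenModel(nn.Module):
    """Trainable 'student' feature extractor and classifier, plus frozen
    'teacher' copies of the previous task's model used for distillation."""

    def __init__(self, feature_extractor: nn.Module, classifier: nn.Module):
        super().__init__()
        self.feature_extractor = feature_extractor  # student, trainable
        self.classifier = classifier                # student, trainable
        self.teacher_fe = None                      # frozen previous feature extractor
        self.teacher_clf = None                     # frozen previous classifier

    def start_new_task(self):
        # Snapshot the current model as the teacher before training on new data.
        self.teacher_fe = copy.deepcopy(self.feature_extractor)
        self.teacher_clf = copy.deepcopy(self.classifier)
        for p in list(self.teacher_fe.parameters()) + list(self.teacher_clf.parameters()):
            p.requires_grad_(False)
```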

Fundamental to Kaizen is its distinctive loss function, designed to support simultaneous self-supervised training and supervised fine-tuning, thereby bridging the gap between feature extraction and classifier refinement.
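The sketch below illustrates one plausible reading of such a combined objective, with four terms: a self-supervised loss on the new data, feature distillation against the frozen extractor, cross-entropy on whatever labels are available, and distillation of the previous classifier's soft outputs. The weighting factors and the specific distillation losses (MSE, KL divergence) are assumptions for illustration; the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F


def kaizen_loss(model, x_aug1, x_aug2, x_labeled, y, ssl_loss_fn,
                w_fd=1.0, w_ce=1.0, w_kd=1.0):
    # (1) Self-supervised loss on two augmented views of current-task data.
    z1 = model.feature_extractor(x_aug1)
    z2 = model.feature_extractor(x_aug2)
    loss_ssl = ssl_loss_fn(z1, z2)

    # (2) Feature distillation: keep new features close to the frozen teacher's.
    with torch.no_grad():
        z_teacher = model.teacher_fe(x_aug1)
    loss_fd = F.mse_loss(z1, z_teacher)

    # (3) Supervised cross-entropy on whatever labelled data is available.
    logits = model.classifier(model.feature_extractor(x_labeled))
    loss_ce = F.cross_entropy(logits, y)

    # (4) Classifier distillation: match the previous classifier's soft outputs.
    with torch.no_grad():
        logits_teacher = model.teacher_clf(model.teacher_fe(x_labeled))
    loss_kd = F.kl_div(F.log_softmax(logits, dim=-1),
                       F.softmax(logits_teacher, dim=-1),
                       reduction="batchmean")

    return loss_ssl + w_fd * loss_fd + w_ce * loss_ce + w_kd * loss_kd
```

Terms (1) and (2) update and protect the feature extractor, while terms (3) and (4) do the same for the classifier, mirroring the paper's goal of mitigating forgetting in both components.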

Evaluating Practical Deployment of CL Systems

The Kaizen framework underwent thorough evaluation on competitive vision benchmarks. Using a comprehensive range of evaluation metrics, the authors demonstrate that Kaizen notably improves accuracy over existing SSL-based methods across various settings. Notably, Kaizen can integrate different SSL techniques into its architecture, ensuring flexibility and robust handling of catastrophic forgetting. The analysis also investigated extended sequences of continual-learning tasks, revealing the trade-offs between retaining knowledge and learning new tasks.
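The paper's full metric suite is not reproduced here, but continual-learning evaluations of this kind typically build on an accuracy matrix recorded after each task. The snippet below computes two standard quantities, average final accuracy and forgetting, from such a matrix; these are the conventional definitions, not necessarily the paper's exact metrics.

```python
import numpy as np


def average_accuracy(acc: np.ndarray) -> float:
    """Mean accuracy over all tasks after the final training stage."""
    return float(acc[-1].mean())


def forgetting(acc: np.ndarray) -> float:
    """Average drop from each task's best accuracy to its final accuracy."""
    n_tasks = acc.shape[0]
    drops = [acc[:-1, j].max() - acc[-1, j] for j in range(n_tasks - 1)]
    return float(np.mean(drops))


# acc[i][j] = accuracy on task j after training on task i (toy numbers).
acc = np.array([[0.80, 0.00, 0.00],
                [0.70, 0.78, 0.00],
                [0.65, 0.72, 0.81]])
print(average_accuracy(acc))  # ~0.727
print(forgetting(acc))        # 0.105
```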

Distinctive Features of Kaizen Architecture

Kaizen leverages both labelled and unlabelled data during the SSL process, offering flexibility in the pre-training and fine-tuning phases. The key contributions of this research are:

  1. a practical CL framework deployable at any stage of training (a sketch of the per-task training loop follows this list);
  2. a novel evaluation setup, reflective of real-life applications, that focuses on the overall continual learning process;
  3. empirical analysis showcasing Kaizen's robustness to catastrophic forgetting.
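Building on the sketches above, a hypothetical per-task training loop might interleave the two data sources as follows: every step consumes a pair of augmented unlabelled views, while labelled batches are cycled in for the supervised and classifier-distillation terms, reflecting the premise that labels can be leveraged at any step. The loop and its hyperparameters are illustrative assumptions, not the paper's procedure.

```python
import itertools

import torch


def train_task(model, unlabeled_loader, labeled_loader, ssl_loss_fn, epochs=1):
    # Freeze the previous model as the teacher (on the very first task the
    # distillation terms would normally be skipped; omitted here for brevity).
    model.start_new_task()
    opt = torch.optim.SGD(
        (p for p in model.parameters() if p.requires_grad), lr=0.05)
    labeled = itertools.cycle(labeled_loader)  # reuse labels as often as needed
    for _ in range(epochs):
        for x_aug1, x_aug2 in unlabeled_loader:
            x_lab, y = next(labeled)
            loss = kaizen_loss(model, x_aug1, x_aug2, x_lab, y, ssl_loss_fn)
            opt.zero_grad()
            loss.backward()
            opt.step()
```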

Conclusion and Future Horizons

The results underscore Kaizen's strength: it significantly outperforms previous self-supervised models in overall performance and knowledge retention. This paves the way for the practical deployment of continual learning systems in dynamic, real-life contexts. The framework introduced by this research holds the potential for significant advances in AI and computer vision.

Authors (6)
  1. Chi Ian Tang
  2. Lorena Qendro
  3. Dimitris Spathis
  4. Fahim Kawsar
  5. Cecilia Mascolo
  6. Akhil Mathur
Citations (9)