Solving the Catastrophic Forgetting Problem in Generalized Category Discovery (2501.05272v2)

Published 9 Jan 2025 in cs.CV

Abstract: Generalized Category Discovery (GCD) aims to identify a mix of known and novel categories within unlabeled datasets, providing a more realistic setting for image recognition. Essentially, GCD needs to remember existing patterns thoroughly to recognize novel categories. The recent state-of-the-art method SimGCD transfers knowledge from known-class data to the learning of novel classes through debiased learning. However, some patterns are catastrophically forgotten during adaptation, leading to poor performance on novel-category classification. To address this issue, we propose a novel learning approach, LegoGCD, which is seamlessly integrated into previous methods to enhance the discrimination of novel classes while maintaining performance on previously encountered known classes. Specifically, we design two techniques, termed Local Entropy Regularization (LER) and the Dual-views Kullback-Leibler divergence constraint (DKL). LER optimizes the distribution of potential known-class samples in unlabeled data, thus ensuring the preservation of knowledge related to known categories while learning novel classes. Meanwhile, DKL introduces a Kullback-Leibler divergence to encourage the model to produce similar prediction distributions for two views of the same image. In this way, it avoids mismatched predictions and simultaneously generates more reliable potential known-class samples. Extensive experiments validate that the proposed LegoGCD effectively addresses the known-category forgetting issue across all datasets, e.g., delivering a 7.74% and 2.51% accuracy boost on known and novel classes in CUB, respectively. Our code is available at: https://github.com/Cliffia123/LegoGCD.

Summary

  • The paper introduces LegoGCD, a novel method addressing catastrophic forgetting in Generalized Category Discovery (GCD) models like SimGCD.
  • LegoGCD employs Local Entropy Regularization (LER) to preserve known class knowledge and a Dual-views Kullback-Leibler Divergence Constraint (DKL) for predictive consistency.
  • Experimental results show LegoGCD significantly improves known class accuracy on various datasets, enhancing model stability for real-world dynamic environments.

Addressing Catastrophic Forgetting in Generalized Category Discovery

The paper "Solving the Catastrophic Forgetting Problem in Generalized Category Discovery" presents a comprehensive paper on Generalized Category Discovery (GCD) and introduces a novel approach, termed LegoGCD, to mitigate the issue of catastrophic forgetting. Catastrophic forgetting is a common problem in continual learning models, where previously learned information is lost during the acquisition of new knowledge. This paper is co-authored by experts from institutions like Sun Yat-sen University and Peking University, among others, and offers a significant contribution to the field of GCD.

Problem Context and Motivation

GCD is an advanced setting in machine learning where the goal is to identify both known and novel categories within unlabeled datasets. Traditional methods often suffer from a decline in the recognition ability of known categories as models adapt to new classes, a problem referred to as catastrophic forgetting. The authors highlight this issue in SimGCD, a state-of-the-art method for GCD, which demonstrates a decrease in accuracy for known categories when focusing on novel ones. This phenomenon is problematic as it degrades the model's overall performance and utility in practical applications.

Methodology: LegoGCD

To address this challenge, the authors propose LegoGCD, an innovative approach that incorporates two key techniques:

  1. Local Entropy Regularization (LER): LER preserves knowledge of known classes during training. It applies a local entropy-based regularizer to high-confidence potential known-class samples within the unlabeled data, ensuring that the model retains its ability to classify known categories accurately while learning new classes.
  2. Dual-views Kullback-Leibler Divergence Constraint (DKL): To support LER, DKL aligns the predictive distributions of two augmented views of the same image. This constraint keeps the model's predictions consistent across views, which makes the selection of potential known-class samples for LER more reliable (both terms are sketched in code after this list).
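Both terms can be read as loss components computed on the classifier's logits. The following PyTorch-style sketch shows one plausible formulation; the confidence threshold, the entropy-minimization form of LER, and the symmetric KL used for DKL are illustrative assumptions rather than the paper's exact equations.

```python
import torch
import torch.nn.functional as F

def ler_loss(logits_unlabeled, num_known_classes, conf_threshold=0.9):
    """Local Entropy Regularization (LER), sketched.

    For unlabeled samples confidently assigned to a known class, penalize
    the entropy of their predictions so knowledge of known categories is
    not washed out while novel classes are learned. The threshold and the
    entropy-minimization form are assumptions, not the paper's exact recipe.
    """
    probs = F.softmax(logits_unlabeled, dim=1)
    known_probs = probs[:, :num_known_classes]
    # Select "potential known-class" samples by their confidence on known classes.
    conf, _ = known_probs.max(dim=1)
    mask = conf > conf_threshold
    if mask.sum() == 0:
        return logits_unlabeled.new_zeros(())
    selected = probs[mask]
    entropy = -(selected * torch.log(selected + 1e-8)).sum(dim=1)
    return entropy.mean()

def dkl_loss(logits_view1, logits_view2):
    """Dual-views KL divergence constraint (DKL), sketched.

    Encourages two augmented views of the same image to produce similar
    prediction distributions. The symmetric form below is an assumption;
    a one-directional variant would also fit the description.
    """
    p = F.log_softmax(logits_view1, dim=1)
    q = F.log_softmax(logits_view2, dim=1)
    kl_pq = F.kl_div(p, q, log_target=True, reduction="batchmean")
    kl_qp = F.kl_div(q, p, log_target=True, reduction="batchmean")
    return 0.5 * (kl_pq + kl_qp)
```

In training, both terms would be computed on the unlabeled portion of each batch, alongside SimGCD's existing supervised and self-distillation losses.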

LegoGCD integrates these components into the existing SimGCD framework, essentially building upon it like Lego blocks, hence the name. This integration is seamless, requiring minimal modification to the original framework and adding no additional parameters.
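Concretely, the plug-in nature suggests the two terms are simply added to SimGCD's existing objective with scalar weights. A minimal sketch follows, reusing the functions above; the weights LAMBDA_LER and LAMBDA_DKL are hypothetical placeholders, and the actual coefficients should be taken from the released code.

```python
# Hypothetical weighting of the LegoGCD terms on top of the SimGCD loss;
# the real coefficients and any scheduling may differ from this sketch.
LAMBDA_LER = 1.0
LAMBDA_DKL = 1.0

def legogcd_loss(simgcd_loss, logits_unlabeled, logits_view1, logits_view2, num_known_classes):
    """Combine SimGCD's objective with the LER and DKL terms (sketch).

    Only the existing classifier logits are reused, which is consistent with
    the claim that the integration adds no extra parameters.
    """
    return (simgcd_loss
            + LAMBDA_LER * ler_loss(logits_unlabeled, num_known_classes)
            + LAMBDA_DKL * dkl_loss(logits_view1, logits_view2))
```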

Experimental Evaluation

The efficacy of LegoGCD is validated through extensive experiments on several datasets, including CIFAR10/100, ImageNet, and various fine-grained datasets such as CUB and Stanford Cars. The results indicate substantial improvements, particularly in known class accuracy. For instance, on the CUB dataset, LegoGCD achieves a 72.18% accuracy on known classes, outperforming SimGCD by 7.74%. This demonstrates LegoGCD's significant capability in addressing catastrophic forgetting and enhancing model stability across both known and novel categories.

Implications and Future Directions

The implications of this research are substantial for both theoretical and practical advancements in AI systems. By effectively alleviating catastrophic forgetting, LegoGCD makes a compelling case for its application in real-world scenarios where models are expected to operate in dynamic environments with both known and emerging categories. The integration of LER and DKL may serve as a blueprint for future developments in this domain, encouraging further exploration of entropy-based regularizations and consistency constraints in continual learning frameworks.

Moving forward, this approach could be extended to even more complex environments and integrated into diverse applications beyond image classification. The adaptability and minimal overhead of LegoGCD suggest potential for its use in enhancing neural network robustness in various AI applications. Moreover, the foundational techniques proposed warrant deeper theoretical exploration to understand their possible intersections with other fields such as reinforcement learning and domain adaptation.

In conclusion, this paper provides valuable insights and a practical solution to a persistent problem in machine learning, contributing to the advancement of GCD methodologies and encouraging further research in this dynamic field.