
Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class Incremental Learning (2302.03004v1)

Published 6 Feb 2023 in cs.CV and cs.LG

Abstract: Few-shot class-incremental learning (FSCIL) has been a challenging problem as only a few training samples are accessible for each novel class in the new sessions. Finetuning the backbone or adjusting the classifier prototypes trained in the prior sessions would inevitably cause a misalignment between the feature and classifier of old classes, which explains the well-known catastrophic forgetting problem. In this paper, we deal with this misalignment dilemma in FSCIL inspired by the recently discovered phenomenon named neural collapse, which reveals that the last-layer features of the same class will collapse into a vertex, and the vertices of all classes are aligned with the classifier prototypes, which are formed as a simplex equiangular tight frame (ETF). It corresponds to an optimal geometric structure for classification due to the maximized Fisher Discriminant Ratio. We propose a neural collapse inspired framework for FSCIL. A group of classifier prototypes are pre-assigned as a simplex ETF for the whole label space, including the base session and all the incremental sessions. During training, the classifier prototypes are not learnable, and we adopt a novel loss function that drives the features into their corresponding prototypes. Theoretical analysis shows that our method holds the neural collapse optimality and does not break the feature-classifier alignment in an incremental fashion. Experiments on the miniImageNet, CUB-200, and CIFAR-100 datasets demonstrate that our proposed framework outperforms the state-of-the-art performances. Code address: https://github.com/NeuralCollapseApplications/FSCIL

Authors (6)
  1. Yibo Yang (80 papers)
  2. Haobo Yuan (22 papers)
  3. Xiangtai Li (128 papers)
  4. Zhouchen Lin (158 papers)
  5. Philip Torr (172 papers)
  6. Dacheng Tao (829 papers)
Citations (80)

Summary

  • The paper introduces a fixed ETF-inspired alignment framework that mitigates feature-classifier misalignment in few-shot incremental learning.
  • It employs a novel Dot-Regression loss to fine-tune only the projection layer using both novel class samples and stored old class means.
  • Empirical evaluations on miniImageNet, CIFAR-100, and CUB-200 show significantly enhanced average accuracy and reduced performance degradation.

Analysis of "Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class Incremental Learning"

In the domain of incremental learning where neural networks must assimilate new data without catastrophic forgetting of previously learned information, Few-Shot Class-Incremental Learning (FSCIL) presents a unique challenge due to severe class imbalance and the limited samples available for novel classes. The paper "Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class Incremental Learning" explores this problem, proposing a framework that leverages the phenomenon of neural collapse to maintain feature-classifier alignment across incremental sessions.

Theoretical Background

Neural collapse, observed in the terminal phase of training, describes the convergence of each class's last-layer features to their within-class mean, with these means aligning with the classifier prototypes to form the vertices of a simplex Equiangular Tight Frame (ETF). This structure maximizes the Fisher Discriminant Ratio and is therefore geometrically optimal for classification. Existing incremental learning strategies often modify classifier prototypes in response to new data, leading to feature-classifier misalignment and the resulting forgetting of old classes. This paper's approach instead predefines the optimal alignment inspired by neural collapse, mitigating the dilemma of adapting to new data while maintaining previous knowledge.
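As a concrete illustration, a simplex ETF of K prototypes in d dimensions (d ≥ K − 1) can be constructed from any orthonormal basis via the standard formula M = √(K/(K−1)) · U(I − (1/K)𝟙𝟙ᵀ). The sketch below is a minimal NumPy version; the function name and the random-basis choice are illustrative, not from the paper's code:

```python
import numpy as np

def simplex_etf(num_classes, feat_dim, seed=0):
    """Construct a simplex ETF: K unit-norm prototypes in R^d whose
    pairwise cosine similarity is exactly -1/(K-1)."""
    K, d = num_classes, feat_dim
    assert d >= K - 1, "feature dimension must be at least K - 1"
    rng = np.random.default_rng(seed)
    # Random orthonormal basis U in R^{d x K}
    U, _ = np.linalg.qr(rng.standard_normal((d, K)))
    # M = sqrt(K/(K-1)) * U (I_K - (1/K) 1 1^T); columns are the prototypes
    M = np.sqrt(K / (K - 1)) * U @ (np.eye(K) - np.ones((K, K)) / K)
    return M
```

Because the prototypes are maximally and equally separated, they can be fixed in advance for the whole label space, as the paper's framework does.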

Methodology

The proposed framework pre-assigns a simplex ETF as the classifier over the complete label space—both the base classes and all incrementally added classes—and keeps this structure fixed through all training sessions. The backbone is trained in the base session, and a projection layer is added to map feature representations toward the ETF classifier. A novel loss function, termed Dot-Regression (DR) loss, fine-tunes features toward their respective target prototypes without altering the classifier, reducing the risk of misalignment across sessions.
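The DR loss, as described, measures how far each l2-normalized feature's cosine with its fixed class prototype falls short of 1. A minimal NumPy sketch (the function name and array conventions are assumptions for illustration):

```python
import numpy as np

def dr_loss(features, labels, prototypes):
    """Dot-Regression loss: L = mean_i 1/2 * (m_{y_i}^T h_i - 1)^2,
    where h_i is the l2-normalized feature of sample i and m_{y_i} is
    its fixed ETF prototype (prototypes has shape (d, K), unit columns)."""
    H = features / np.linalg.norm(features, axis=1, keepdims=True)
    # Cosine between each normalized feature and its class prototype
    cos = np.einsum('nd,dn->n', H, prototypes[:, labels])
    return 0.5 * np.mean((cos - 1.0) ** 2)
```

Since the prototypes never move, minimizing this loss only pulls features toward their targets, which is the mechanism that keeps old-class alignment intact across sessions.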

Throughout the incremental sessions, only the projection layer is fine-tuned, using a mix of novel class samples and stored mean features of old classes maintained in a memory module. This choice preserves the alignment mandated by neural collapse optimality and prevents perturbation of old-class knowledge during incremental training.
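To make the projection-only fine-tuning concrete, one gradient step of the DR loss with respect to a linear projection W (with the ETF prototype m held fixed) can be derived in closed form: with z = Wx, h = z/‖z‖, and cos = mᵀh, we get ∂L/∂z = (cos − 1)(m − cos·h)/‖z‖, which is perpendicular to z. The sketch below is a hand-rolled single-sample update, not the paper's implementation:

```python
import numpy as np

def dr_grad_step(W, x, m, lr=0.2):
    """One SGD step on the DR loss L = 1/2 (m^T h - 1)^2 w.r.t. the
    projection matrix W only; x is a backbone feature (or a stored
    old-class mean), m its fixed unit-norm ETF prototype."""
    z = W @ x
    n = np.linalg.norm(z)
    h = z / n
    cos = m @ h
    # Analytic gradient: dL/dz = (cos - 1)(m - cos*h)/||z||; dL/dW = dz x^T
    dz = (cos - 1.0) * (m - cos * h) / n
    W = W - lr * np.outer(dz, x)
    return W, 0.5 * (cos - 1.0) ** 2
```

Because the gradient is perpendicular to z, each step rotates the projected feature toward its prototype rather than rescaling it, mirroring how the fixed classifier leaves only the feature side to adapt.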

Empirical Evaluation

Experimental analysis on benchmark datasets—miniImageNet, CIFAR-100, and CUB-200—demonstrates that the ETF classifier coupled with DR loss significantly reduces feature-classifier misalignment. The results indicate superior performance over state-of-the-art methods, particularly in mitigating performance degradation across sessions, a common issue in FSCIL scenarios.

The authors report a marked improvement over existing baselines, with notable gains in average accuracy across all sessions. These results substantiate the effectiveness of maintaining a consistent geometric structure, as prescribed by neural collapse, throughout incremental learning.

Implications and Future Directions

The implications of this research are manifold. The alignment principle inspired by neural collapse, when applied to FSCIL, yields substantial gains in maintaining a balanced classification performance despite data scarcity and imbalance. This approach suggests that enforcing a fixed optimal feature-classifier geometry could be an effective strategy for other forms of lifelong learning and model adaptation tasks.

Future explorations could evaluate the scalability of the proposed alignment across a broader spectrum of applications, such as in heterogeneous or unbalanced data scenarios more complex than those addressed in this paper. Moreover, extending the findings to unsupervised or semi-supervised learning paradigms may offer additional insights into potential applications.

In summary, the paper presents a compelling case for utilizing neural collapse as a foundational principle in designing robust incremental learning frameworks. This contribution to the field of machine learning highlights a path forward in reconciling the exigencies of continual adaptation and retention, setting a new standard for FSCIL methodologies.