Incremental Learning in Semantic Segmentation: Addressing the Background Shift
The paper "Modeling the Background for Incremental Learning in Semantic Segmentation" addresses the critical issue of catastrophic forgetting in the context of semantic segmentation. The authors identify a fundamental challenge intrinsic to incremental learning for semantic segmentation tasks: the dynamic nature of the background class that leads to semantic distribution shifts across different incremental learning steps.
Key Contributions
- Novel Objective Function: The paper proposes a distillation-based framework that explicitly accounts for the semantic shift of the background class. Both the cross-entropy and the knowledge-distillation losses are revised so that the background label is treated as uncertainty over unannotated classes, addressing a core limitation of existing methods that treat the background as a fixed category across learning steps.
- Focused Initialization Strategy: New classifiers are initialized from the background classifier's weights so that, before training on the new step, the background probability is spread uniformly over the background and the new classes. This mitigates the initial bias toward predicting background and stabilizes learning when new classes are introduced (see the sketch after this list).
- Empirical Validation: The effectiveness of the proposed approach, termed MiB (Modeling the Background), is demonstrated on Pascal VOC 2012 and ADE20K. MiB consistently outperforms existing state-of-the-art incremental learning methods, retaining knowledge of old classes while integrating knowledge from new ones.
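The initialization strategy can be stated concretely. Below is a minimal PyTorch sketch, assuming a 1x1 convolutional classifier whose channel 0 is the background and whose last rows are the newly added classes; the function name and layout are illustrative, not the authors' code:

```python
import math
import torch

@torch.no_grad()
def init_new_classifier(classifier: torch.nn.Conv2d, n_new: int) -> None:
    """Initialize the newly added class rows from the background row
    (index 0) so the old background probability is spread uniformly
    over the background and the n_new new classes."""
    w_bkg = classifier.weight[0].clone()   # background weights
    b_bkg = classifier.bias[0].clone()     # background bias
    shift = math.log(n_new + 1)
    classifier.weight[-n_new:] = w_bkg     # broadcast to every new row
    classifier.bias[-n_new:] = b_bkg - shift
    classifier.bias[0] = b_bkg - shift     # background bias is lowered too
```

With this setup, each new class initially scores exactly like the background, so the softmax assigns it the old background probability divided by n_new + 1, which is the behavior the paper describes.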
Methodological Insights
The primary innovation lies in how the semantic shift of the background class is modeled, which departs from conventional incremental learning approaches developed for image classification. The cross-entropy loss is modified so that a pixel labeled as background is scored against the probability of being background or any previously learned class, ensuring the background is not disproportionately favored over old classes. This is reinforced by a distillation loss in which the old model's background prediction is compared against the new model's combined probability of background or any new class, keeping the two models consistent despite the shifting label space.
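These two losses can be sketched in a few lines of PyTorch. The snippet below is a minimal illustration of the ideas as described in the paper, assuming channel 0 is the background, channels 1 through n_old are old classes, the remaining channels are new classes, and step-t labels contain only background and new-class indices; names and details are illustrative rather than the authors' implementation:

```python
import torch
import torch.nn.functional as F

def unbiased_cross_entropy(logits, targets, n_old):
    """Cross-entropy in which the background (channel 0) absorbs the
    probability mass of all old classes, so a pixel labeled background
    is not penalized for being predicted as an old class.

    logits:  (B, 1 + n_old + n_new, H, W)
    targets: (B, H, W) containing 0 (background) or new-class indices
    """
    log_probs = F.log_softmax(logits, dim=1)
    # log P(background OR any old class): channels 0 .. n_old
    log_bkg = torch.logsumexp(log_probs[:, :n_old + 1], dim=1, keepdim=True)
    merged = torch.cat([log_bkg, log_probs[:, n_old + 1:]], dim=1)
    # Compact the label space: background stays 0, new classes become 1..n_new
    shifted = torch.where(targets > 0, targets - n_old, targets)
    return F.nll_loss(merged, shifted)

def unbiased_distillation(new_logits, old_logits):
    """Distillation in which the new model's 'background' is the combined
    probability of background OR any new class, matching what the old
    model could have labeled as background."""
    n_prev = old_logits.shape[1]                  # 1 + n_old channels
    old_probs = F.softmax(old_logits, dim=1)
    new_log_probs = F.log_softmax(new_logits, dim=1)
    log_bkg = torch.logsumexp(
        torch.cat([new_log_probs[:, :1], new_log_probs[:, n_prev:]], dim=1),
        dim=1, keepdim=True)
    aligned = torch.cat([log_bkg, new_log_probs[:, 1:n_prev]], dim=1)
    return -(old_probs * aligned).sum(dim=1).mean()
```

The log-sum-exp over grouped channels is the numerically stable way to compute the combined probabilities; the overall training objective is a weighted sum of the two terms.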
Experimental Outcomes
The results are compelling, with consistent gains over competitive baselines such as LwF (Learning without Forgetting) and ILT across multiple scenarios, including single-step and multi-step class additions in both the disjoint and overlapped setups. On the challenging ADE20K dataset in particular, MiB sets a new state of the art, substantially reducing forgetting while narrowing the gap to the offline joint-training upper bound.
Implications and Future Directions
The paper's insights extend incremental learning to domains where the background class itself changes over time. This matters for real-world applications such as autonomous driving and medical imaging, where new object categories are encountered continuously and past data cannot be revisited. The methodology could be further enriched by exploring more sophisticated initialization schemes and loss functions tailored to diverse data modalities.
In conclusion, this research represents a significant step forward for incremental learning in semantic segmentation. By addressing background semantic shift with a principled methodological framework, it lays the groundwork for more resilient systems capable of adaptive learning in dynamic environments.