Less-forgetting Learning in Deep Neural Networks (1607.00122v1)

Published 1 Jul 2016 in cs.LG

Abstract: Catastrophic forgetting causes deep neural networks to lose previously learned information when they learn from data collected in new environments, such as data captured by different sensors or under different lighting conditions. This paper presents a new method for alleviating the catastrophic forgetting problem. Unlike previous research, our method does not use any information from the source domain. Surprisingly, the method is nonetheless very effective at retaining source-domain information, and we show its effectiveness in several experiments. Furthermore, we observe that forgetting also occurs between mini-batches during ordinary training with stochastic gradient descent, and that this is one of the factors degrading the generalization performance of the network. We also apply the proposed method to this problem. Finally, we show that our less-forgetting learning method also helps improve the recognition rates of deep neural networks.

Citations (218)

Summary

  • The paper introduces a novel less-forgetting algorithm that minimizes catastrophic forgetting in DNNs by balancing source retention and new learning using combined cross-entropy and Euclidean loss.
  • Empirical results on CIFAR-10, MNIST, and SVHN demonstrate competitive recognition rates and robustness against significant domain shifts.
  • The approach effectively mitigates both mini-batch and incremental forgetting, paving the way for improved continual learning and domain adaptation in AI.

An Examination of Less-forgetting Learning in Deep Neural Networks

Heechul Jung et al.'s paper, "Less-forgetting Learning in Deep Neural Networks," addresses catastrophic forgetting in Deep Neural Networks (DNNs), a significant issue when networks are trained incrementally or adapted to new contexts such as a different domain. The authors introduce a method that mitigates catastrophic forgetting without requiring access to the original source-domain data.

The methodology centers on retaining knowledge learned from the source domain while accommodating new information from the target domain. This is achieved by enforcing two properties: the decision boundaries of the source-trained classifier are kept unchanged, and the feature representations of incoming data remain close to those produced by the source network. The less-forgetting (LF) algorithm trains with stochastic gradient descent (SGD) on a loss that combines a cross-entropy term for the target task with a Euclidean term penalizing drift of the features away from the source network's, preserving the original representations while learning new ones. In experiments on standard and transformed datasets, the method not only retains information from previously learned data but also improves generalization performance.
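To make the objective concrete, the sketch below shows one way such a combined loss could be written. It is a minimal illustration, not the authors' released code: it assumes a PyTorch-style setup in which both networks return (logits, penultimate features), `source_net` is a frozen copy of the source-trained network, the Euclidean term is approximated with mean-squared error, and `lambda_e` is a hypothetical name for the trade-off weight. The paper's decision-boundary property (keeping the final classification layer fixed) is omitted here for brevity.

```python
import torch
import torch.nn.functional as F

def less_forgetting_loss(net, source_net, x, y, lambda_e=1.0):
    """Cross-entropy on the new labels plus a Euclidean-style penalty that
    keeps the adapted features close to the source network's features.
    Assumes each network returns (logits, penultimate_features)."""
    logits, feats = net(x)                  # network being adapted
    with torch.no_grad():
        _, src_feats = source_net(x)        # frozen source-trained network
    ce = F.cross_entropy(logits, y)         # learn the new task
    eu = F.mse_loss(feats, src_feats)       # stay close to the source features
    return ce + lambda_e * eu
```

In this reading, `lambda_e` controls the retention/plasticity trade-off: larger values keep the network closer to its source behaviour at the cost of slower adaptation to the new data.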

The paper provides a robust empirical analysis, applying the LF method to established benchmarks including CIFAR-10 and an MNIST/SVHN transfer setting. The results show a noticeable reduction in the forgetting rate compared with traditional transfer learning and with alternatives such as LWTA and Maxout networks. On CIFAR-10 in particular, the approach achieves a superior balance between retaining source information and acquiring target-specific knowledge, with competitive recognition rates even under substantial domain shifts such as conversion from color to grayscale.

An insightful observation by the authors is that forgetting occurs even within ordinary mini-batch training: parameters tuned on one mini-batch are partly overwritten when the next is processed. A modified version of the LF method adapts to this scenario by alternating, at set intervals, between retaining existing knowledge and incorporating new inputs, thereby smoothing the learning curve.
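A rough sketch of how such an alternating schedule might look is shown below, reusing the hypothetical `less_forgetting_loss` from the previous example. The snapshot interval `refresh_every`, the optimizer settings, and the data loader are illustrative assumptions rather than values from the paper.

```python
import copy
import torch

def train_less_forgetting(net, loader, epochs=1, refresh_every=100, lambda_e=1.0):
    """SGD training in which a frozen snapshot of the network, refreshed at
    set intervals, serves as the reference for the feature-retention term,
    so knowledge acquired on earlier mini-batches is not simply overwritten."""
    optimizer = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
    reference = copy.deepcopy(net).eval()        # initial frozen reference
    step = 0
    for _ in range(epochs):
        for x, y in loader:
            if step > 0 and step % refresh_every == 0:
                reference = copy.deepcopy(net).eval()  # refresh the reference periodically
            loss = less_forgetting_loss(net, reference, x, y, lambda_e)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            step += 1
```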

The paper suggests significant implications for future work in AI, where DNNs must efficiently learn across varied data distributions without compromising previously acquired knowledge. The less-forgetting approach could be pivotal in contexts requiring continual learning or where datasets evolve dynamically.

Overall, Jung et al. provide a meticulous exploration into mitigating information loss in DNNs, with promising implications for domain adaptation and incremental learning. The research sets the stage for further exploration into advanced learning strategies that balance information retention and acquisition, a crucial aspect as the ambitions for AI continue to expand into more complex and diverse environments.