Overview of S-Prompts Learning with Pre-trained Transformers
The paper "S-Prompts Learning with Pre-trained Transformers: An Occam’s Razor for Domain Incremental Learning" presents an approach to domain incremental learning (DIL) that addresses the persistent challenge of catastrophic forgetting. In continual learning scenarios, and in DIL in particular, state-of-the-art deep neural networks often fail to retain previously learned knowledge when exposed to new tasks sequentially. The paper introduces S-Prompting, a paradigm that pairs frozen pre-trained transformers with domain-specific prompts to mitigate forgetting across domains without storing exemplars.
Key Contributions
- Independent Prompting Paradigm: The central idea of S-Prompting is to decouple prompt learning across domains, in contrast to traditional methods that rely on knowledge shared across tasks. Because each domain's prompt is learned independently, the model can specialize per domain without interference from previously learned tasks, reducing catastrophic forgetting (a minimal sketch follows this list).
- Efficiency and Scalability: Each new domain adds only about 0.03% to the model's parameter count, so the paradigm scales to a large number of domains. The authors show that this minimal computational overhead is sufficient for the model to retain domain-specific features (a back-of-the-envelope estimate of the overhead also follows the list).
- Superior Performance: Empirically, S-Prompts achieves roughly a 30% relative accuracy improvement over state-of-the-art exemplar-free DIL methods and outperforms exemplar-based methods by about 6% on average. These results hold across CDDB-Hard, CORe50, and DomainNet, underscoring the approach's broad applicability and effectiveness.
- Use of Pre-trained Transformers: The method builds on pre-trained vision transformers (ViT) and Contrastive Language-Image Pre-training (CLIP). By keeping these backbones fixed and tuning only the associated prompts, S-Prompting capitalizes on the transformers' robust feature representations while avoiding overfitting to any single domain.
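To make the paradigm concrete, the sketch below illustrates the independent-prompting idea under simplifying assumptions: a small frozen transformer encoder stands in for the pre-trained ViT/CLIP image encoder, each domain gets its own learnable prompt tokens plus a lightweight classifier head, and the domain identity is assumed to be known at forward time. The class and method names (`SPromptsSketch`, `add_domain`) are illustrative, not taken from the authors' code.

```python
import torch
import torch.nn as nn

class SPromptsSketch(nn.Module):
    """Per-domain prompts on top of a frozen transformer (illustrative only)."""
    def __init__(self, dim=768, depth=2, prompt_len=10, num_classes=10):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        for p in self.encoder.parameters():   # freeze the pre-trained backbone
            p.requires_grad = False
        self.prompts = nn.ParameterList()     # one learnable prompt per domain
        self.heads = nn.ModuleList()          # one classifier head per domain
        self.dim, self.prompt_len, self.num_classes = dim, prompt_len, num_classes

    def add_domain(self):
        # Each new domain gets fresh prompt tokens and a fresh head;
        # parameters of earlier domains are never revisited or overwritten.
        self.prompts.append(nn.Parameter(0.02 * torch.randn(1, self.prompt_len, self.dim)))
        self.heads.append(nn.Linear(self.dim, self.num_classes))

    def forward(self, tokens, domain_id):
        # tokens: (batch, seq_len, dim) patch embeddings from the frozen tokenizer
        b = tokens.size(0)
        prompt = self.prompts[domain_id].expand(b, -1, -1)
        x = torch.cat([prompt, tokens], dim=1)      # prepend the domain's prompt
        x = self.encoder(x)
        # classify from the pooled prompt positions (a simplification)
        return self.heads[domain_id](x[:, :self.prompt_len].mean(dim=1))
```

Training on a new domain would then amount to calling `add_domain()` and passing only the newest prompt and head to the optimizer, so earlier domains' parameters cannot drift; how the domain is identified at test time is a separate component of the paper and is not shown here.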
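The roughly 0.03% per-domain overhead is also easy to sanity-check with rough numbers. The figures below are assumptions for illustration (a ViT-B-sized backbone, 768-dimensional tokens, a ten-token prompt, and a 50-class head), not the paper's exact configuration:

```python
# Back-of-the-envelope estimate of the per-domain parameter overhead.
# All sizes are illustrative assumptions, not the paper's exact configuration.
backbone_params = 86_000_000      # roughly ViT-B/16
embed_dim = 768                   # ViT-B token width
prompt_len = 10                   # assumed prompt length
num_classes = 50                  # e.g. a CORe50-sized label space

prompt_params = prompt_len * embed_dim               # 7,680
head_params = embed_dim * num_classes + num_classes  # 38,450
per_domain = prompt_params + head_params             # 46,130

print(f"per-domain overhead: {per_domain / backbone_params:.4%}")  # ~0.05%
```

With these assumed sizes the addition lands at a few hundredths of a percent of the backbone per domain, the same order of magnitude as the figure reported above; the exact value depends on the prompt length and head size chosen.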
Implications and Future Directions
The S-Prompts approach introduces a novel and efficient methodology for addressing the challenge of continual learning in highly varied domains. It paves the way for future exploration into independent learning mechanisms, especially in tasks where domain variance is significant. Moreover, the use of prompt-based learning can inspire advancements in other fields like zero-shot and few-shot learning, where task-specific prompt engineering may provide significant advantages.
As the field of continual learning evolves, particularly under the ubiquitous influence of transformers, S-Prompts may serve as a foundational approach for designing systems that require both adaptability and retention. This work could also stimulate further research into balancing the trade-off between knowledge retention and model scaling as more domains and data become available. Further exploration could assess how the paradigm integrates with other forms of learning and adaptation, expanding its applicability across machine learning applications.
In conclusion, the S-Prompts learning approach offers a fresh perspective and potential solution to the pressing issue of catastrophic forgetting in domain incremental learning. By rethinking the use of prompts within pre-trained transformer networks, this work provides compelling evidence of the effectiveness of independent feature learning across domains, signifying a meaningful contribution to the landscape of continual learning research.