
DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning (2204.04799v2)

Published 10 Apr 2022 in cs.LG and cs.CV

Abstract: Continual learning aims to enable a single model to learn a sequence of tasks without catastrophic forgetting. Top-performing methods usually require a rehearsal buffer to store past pristine examples for experience replay, which, however, limits their practical value due to privacy and memory constraints. In this work, we present a simple yet effective framework, DualPrompt, which learns a tiny set of parameters, called prompts, to properly instruct a pre-trained model to learn tasks arriving sequentially without buffering past examples. DualPrompt presents a novel approach to attach complementary prompts to the pre-trained backbone, and then formulates the objective as learning task-invariant and task-specific "instructions". With extensive experimental validation, DualPrompt consistently sets state-of-the-art performance under the challenging class-incremental setting. In particular, DualPrompt outperforms recent advanced continual learning methods with relatively large buffer sizes. We also introduce a more challenging benchmark, Split ImageNet-R, to help generalize rehearsal-free continual learning research. Source code is available at https://github.com/google-research/l2p.

DualPrompt: Complementary Prompting for Rehearsal-Free Continual Learning

The paper "DualPrompt: Complementary Prompting for Rehearsal-Free Continual Learning" presents a novel approach to addressing the problem of catastrophic forgetting in continual learning (CL) without relying on rehearsal buffers. This method is particularly relevant for settings where privacy and memory constraints are crucial, such as in real-world deployment. The innovation of the paper lies in the DualPrompt framework that consists of two types of prompts: G-Prompt and E-Prompt, which serve to harness task-invariant and task-specific knowledge, respectively.

Background

Continual learning aims to enable models to learn from a stream of tasks while avoiding the degradation in performance on previously learned tasks known as catastrophic forgetting. Many prior approaches employ rehearsal buffers that retain past data for experience replay, which raises privacy concerns and imposes significant memory requirements. Other approaches include regularization-based and architecture-based methods, each with its own limitations.

DualPrompt Framework

DualPrompt sets a new direction by employing a prompting mechanism within a pre-trained transformer backbone. This method attaches complementary prompts to the model:

  • G-Prompt (General Prompt): Shared across all tasks, the G-Prompt instructs the model to leverage task-invariant features.
  • E-Prompt (Expert Prompt): Task-specific prompts that capture distinct, per-task features; the appropriate E-Prompt is selected through a key-query matching mechanism.

By attaching these complementary prompts to different layers of the pre-trained backbone, DualPrompt builds richer representations and retains previously learned information without requiring rehearsal of past data.
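To make the decoupled design concrete, the sketch below illustrates the general idea in PyTorch: a shared G-Prompt is attached at shallow layers, a task-specific E-Prompt selected by cosine key-query matching is attached at deeper layers, and only the prompts and keys are trainable while the backbone stays frozen. The toy encoder, layer indices, prompt lengths, and the use of token-prepending (rather than the paper's prefix-tuning variant) are illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch only: a toy frozen encoder with DualPrompt-style
# G-/E-Prompt attachment and key-query matching (not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyEncoderLayer(nn.Module):
    """Stand-in for one frozen block of a pre-trained transformer."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        h, _ = self.attn(x, x, x)
        return self.norm(x + h)


class DualPromptSketch(nn.Module):
    def __init__(self, dim=64, depth=5, num_tasks=5, g_len=5, e_len=5,
                 g_layers=(0, 1), e_layers=(2, 3, 4)):
        super().__init__()
        self.blocks = nn.ModuleList(ToyEncoderLayer(dim) for _ in range(depth))
        for p in self.blocks.parameters():          # backbone stays frozen
            p.requires_grad_(False)
        # Task-invariant G-Prompt, shared by every task.
        self.g_prompt = nn.Parameter(0.02 * torch.randn(len(g_layers), g_len, dim))
        # Task-specific E-Prompts plus one learnable key per task.
        self.e_prompt = nn.Parameter(0.02 * torch.randn(num_tasks, len(e_layers), e_len, dim))
        self.e_keys = nn.Parameter(0.02 * torch.randn(num_tasks, dim))
        self.g_layers, self.e_layers = set(g_layers), set(e_layers)

    def select_task(self, x):
        # Query = mean-pooled frozen features; pick the E-Prompt whose key
        # has the highest cosine similarity to the query (one per sample).
        with torch.no_grad():
            q = x.mean(dim=1)                                    # (B, dim)
        sim = F.cosine_similarity(q.unsqueeze(1), self.e_keys.unsqueeze(0), dim=-1)
        return sim.argmax(dim=-1)                                # (B,)

    def forward(self, x):
        n = x.size(1)                       # original token count
        task_idx = self.select_task(x)
        g_i = e_i = 0
        for i, block in enumerate(self.blocks):
            extra = []
            if i in self.g_layers:          # shallow layers: shared G-Prompt
                extra.append(self.g_prompt[g_i].expand(x.size(0), -1, -1))
                g_i += 1
            if i in self.e_layers:          # deeper layers: selected E-Prompt
                extra.append(self.e_prompt[task_idx, e_i])
                e_i += 1
            x = block(torch.cat(extra + [x], dim=1)) if extra else block(x)
            x = x[:, -n:]                   # drop prompt positions again
        return x.mean(dim=1)                # pooled feature for a task head


model = DualPromptSketch()
features = model(torch.randn(8, 16, 64))    # 8 sequences of 16 tokens, dim 64
```

During training, only `g_prompt`, `e_prompt`, `e_keys`, and a classification head would receive gradients, which is what keeps the learned parameter count tiny relative to the frozen backbone.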

Numerical Results

Extensive experimental validation demonstrates that DualPrompt outperforms state-of-the-art rehearsal-based and rehearsal-free methods in a challenging class-incremental setting:

  • On benchmarks such as Split CIFAR-100 and the newly introduced Split ImageNet-R, DualPrompt achieves higher average accuracy and lower forgetting than methods that require rehearsal buffers of up to 5000 images (the metrics are defined below).
  • It also surpasses other rehearsal-free methods such as L2P, improving average accuracy by 3%-7%.
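For reference, the two quantities reported in these comparisons are usually defined as follows, where $A_{t,i}$ is the test accuracy on task $i$ after training on task $t$ and $T$ is the number of tasks seen so far; these are the standard class-incremental metrics, and the paper's exact protocol may differ in minor details:

$$\text{Average accuracy} = \frac{1}{T}\sum_{i=1}^{T} A_{T,i}, \qquad \text{Forgetting} = \frac{1}{T-1}\sum_{i=1}^{T-1}\Big(\max_{t \in \{1,\dots,T-1\}} A_{t,i} - A_{T,i}\Big)$$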

Implications and Future Work

DualPrompt offers significant practical advances toward continual learning systems that are privacy-preserving and memory-efficient. The use of pre-trained models allows the system to be both lightweight and high-performing, setting the stage for further research into modular and prompt-based learning strategies.

Future work can explore enhancements to the prompt-selection mechanism, integrate other prompting functions or architectures, and refine the key-query matching. Examining network architectures beyond transformers may also yield further insights and improvements in this domain.

Conclusion

The DualPrompt framework presents a significant step forward in the rehearsal-free continual learning field, demonstrating how thoughtfully designed prompting mechanisms can alleviate forgetting without the need for extensive memory storage for past data. Its success highlights the importance and potential of leveraging pre-trained models within the CL context, a trend likely to influence future research and applications in artificial intelligence.

Authors (11)
  1. Zifeng Wang (78 papers)
  2. Zizhao Zhang (44 papers)
  3. Sayna Ebrahimi (27 papers)
  4. Ruoxi Sun (58 papers)
  5. Han Zhang (338 papers)
  6. Chen-Yu Lee (48 papers)
  7. Xiaoqi Ren (8 papers)
  8. Guolong Su (12 papers)
  9. Vincent Perot (14 papers)
  10. Jennifer Dy (46 papers)
  11. Tomas Pfister (89 papers)
Citations (352)