Exploring the Efficacy of Low Rank Adaptation in Continual Learning
Introduction to Continual Learning Challenges
Continual learning, a critical component of AI that aims to emulate human-like learning by iteratively updating knowledge, faces significant hurdles in modern machine learning systems. These systems often succumb to catastrophic forgetting when updated with new data, a phenomenon in which performance degrades on previously learned data. The introduction of pre-trained transformers brought a glimmer of hope to this endeavor, though it did not entirely solve the problem. The traditional avenue for continual learning with pre-trained models has primarily been prompt tuning, an approach that, while parameter-efficient, shows limitations in retaining or improving model performance across successive data updates.
Revisiting the Paradigm: Low Rank Adaptation
This paper presents a paradigm shift from prompt tuning to Low Rank Adaptation (LoRA), demonstrating its superior capability in facilitating continual learning. The proposed method, Continual Low Rank Adaptation (CoLoR), extends the existing state of the art for domain-incremental learning by incorporating LoRA, maintaining parameter efficiency while achieving significant improvements in predictive performance. The paper argues that while existing methods have leaned toward prompt tuning for its parameter efficiency, LoRA-based methods, as the empirical evaluations show, offer a compelling alternative that balances efficiency with stronger performance.
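To make the underlying mechanism concrete, here is a minimal sketch of a LoRA-style adapter in PyTorch. It illustrates the general idea of freezing the pre-trained weights and learning a low-rank correction on top of them; the class name LoRALinear, the rank r, and the scaling alpha are illustrative choices, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative LoRA wrapper: a frozen base layer plus a trainable low-rank update.

    Sketch only; the class name, rank r, and scaling alpha are illustrative,
    not the paper's exact configuration.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pre-trained weights stay frozen
        # Low-rank factors: the effective weight update is (alpha / r) * B @ A
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(base.out_features, r))        # up-projection, zero-init
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus low-rank correction; only A and B receive gradients.
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())
```

With this parameterization each wrapped layer adds only r·(d_in + d_out) trainable parameters. In a continual setting along the lines of CoLoR, one such adapter set can be trained per task or domain on top of a shared frozen backbone.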
Empirical Validations and Findings
The researchers conducted extensive experiments across different continual learning settings, including domain-incremental, class-incremental, and task-incremental learning. Using benchmarks such as CORe50, DomainNet, and Split CIFAR-100, CoLoR was compared against prompt tuning-based methods and other baselines. The results were clear:
- CoLoR outperformed the closest competing memory-free method by 2% on CORe50 and by a substantial 19% on DomainNet, while also matching or exceeding replay-based methods.
- In the class-incremental learning scenario, CoLoR also improved over S-Prompts, though it initially lagged behind L2P. A refined variant, CoLoR++, was introduced to address this; it leverages better data representations for dataset identification, improving overall results (a rough sketch of this identification step follows the list below).
- Remarkably, CoLoR maintained the parameter efficiency of prompt-based methods while delivering superior performance. This balance underscores the potential of LoRA in fine-tuning pre-trained models for continual learning tasks.
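The dataset-identification step that CoLoR++ improves on amounts to deciding, at test time, which task's adapter to apply. Below is a minimal sketch of one common recipe, assuming per-task clustering of frozen-backbone features with scikit-learn's KMeans; the number of centroids and the function names are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_prototypes(features_per_task, k=5):
    """Summarize each task's training features (e.g. frozen-backbone embeddings)
    with k centroids. Illustrative sketch; the paper's clustering and feature
    choices may differ."""
    return [KMeans(n_clusters=k, n_init=10).fit(f).cluster_centers_
            for f in features_per_task]

def identify_task(query_feature, prototypes):
    """Route a test sample to the task whose nearest centroid is closest."""
    dists = [np.min(np.linalg.norm(centroids - query_feature, axis=1))
             for centroids in prototypes]
    return int(np.argmin(dists))
```

The predicted index then selects the corresponding adapter (and classifier head), which is what lets a per-task method operate without task labels; CoLoR++'s reported gains come from feeding this step better representations.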
Implications and Future Prospects
The findings of this paper have profound implications for the advancement of continual learning strategies:
- Methodological Shift: The success of CoLoR encourages a reevaluation of the dependence on prompt tuning for continual learning, suggesting that LoRA presents an effective alternative path that doesn't compromise parameter efficiency.
- Practical Deployability: CoLoR's effectiveness and efficiency make it a viable option for real-world applications requiring models that adapt over time without the need for extensive retraining or parameter expansion.
- Future Methodological Enhancements: The introduction of CoLoR++ points toward continuous methodological improvements. This iterative approach to enhancing model performance through better representation and fine-tuning mechanisms could spearhead further research in this direction.
- Bridging Learning Paradigms: The research also highlights the potential for narrowing the gap between the different incremental learning settings (domain-, class-, and task-incremental: DIL, CIL, TIL) by leveraging parameter-efficient fine-tuning methods, an area ripe for future exploration.
Conclusion
In summary, this paper challenges the prevailing reliance on prompt tuning in continual learning by introducing and validating Low Rank Adaptation through CoLoR. By achieving state-of-the-art results on domain-incremental and class-incremental benchmarks while retaining parameter efficiency, it sets a strong baseline for future continual learning research and applications. The implications are far-reaching, not only for improving the performance of continually updated models but also for our understanding of efficient continual learning mechanisms.