
Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation (2411.15224v3)

Published 21 Nov 2024 in cs.LG and cs.AI

Abstract: Despite growing interest in the Mamba architecture as a potential replacement for the Transformer architecture, parameter-efficient fine-tuning (PEFT) approaches for Mamba remain largely unexplored. In this study, we present two insight-driven contributions for PEFT in the Mamba architecture: (1) Although state-space models (SSMs) have been regarded as the cornerstone of the Mamba architecture, and were therefore expected to play the primary role in transfer learning, our findings reveal that the Projectors, not the SSMs, are the predominant contributors to transfer learning. (2) Building on this observation, we propose a novel PEFT method specialized to the Mamba architecture: Projector-targeted Diagonal-centric Linear Transformation (ProDiaL). ProDiaL adapts only the pretrained Projectors to new tasks through diagonal-centric linear transformation matrices, without directly fine-tuning the Projector weights. This targeted approach enables efficient task adaptation using less than 1% of the total parameters and delivers strong performance across both vision and language Mamba models, highlighting its versatility and effectiveness.
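The abstract describes adapting only the Projector matrices through diagonal-centric linear transformations while the pretrained weights stay frozen. The sketch below illustrates one way such an adapter could look in PyTorch; the exact parameterization (here a learnable diagonal plus an optional low-rank off-diagonal term), the class name, and the module attribute names used in the usage snippet are assumptions for illustration, not the paper's actual ProDiaL implementation.

```python
import torch
import torch.nn as nn

class DiagonalCentricProjectorAdapter(nn.Module):
    """Minimal sketch of a diagonal-centric linear transformation wrapped
    around a frozen projector (illustrative only; not the paper's exact
    ProDiaL formulation).

    Effective mapping: y = T (W x + b), with T = diag(d) + B A, i.e. a
    learnable diagonal matrix plus an optional low-rank off-diagonal term.
    Only d, A, and B are trained; the pretrained projector stays frozen.
    """

    def __init__(self, projector: nn.Linear, rank: int = 0):
        super().__init__()
        self.projector = projector
        for p in self.projector.parameters():
            p.requires_grad = False  # pretrained projector weights are never updated

        out_dim = projector.out_features
        # Start from the identity transformation so the adapted model
        # initially reproduces the pretrained behaviour.
        self.diag = nn.Parameter(torch.ones(out_dim))
        self.rank = rank
        if rank > 0:
            # Low-rank off-diagonal component, initialized so its
            # contribution is zero at the start of training (B = 0).
            self.A = nn.Parameter(torch.randn(rank, out_dim) * 0.01)
            self.B = nn.Parameter(torch.zeros(out_dim, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = self.projector(x)           # frozen pretrained projection
        out = base * self.diag             # diagonal part: diag(d) applied per feature
        if self.rank > 0:
            # off-diagonal part: (B A) applied to the projector output
            out = out + (base @ self.A.t()) @ self.B.t()
        return out
```

A hypothetical usage sketch, wrapping the projections of one Mamba block and training only the adapter parameters (the attribute names `in_proj` and `out_proj` are assumptions and will differ across Mamba implementations):

```python
block.in_proj = DiagonalCentricProjectorAdapter(block.in_proj)
block.out_proj = DiagonalCentricProjectorAdapter(block.out_proj, rank=4)

trainable = [p for p in block.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
```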


