Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
89 tokens/sec
Gemini 2.5 Pro Premium
41 tokens/sec
GPT-5 Medium
23 tokens/sec
GPT-5 High Premium
19 tokens/sec
GPT-4o
96 tokens/sec
DeepSeek R1 via Azure Premium
88 tokens/sec
GPT OSS 120B via Groq Premium
467 tokens/sec
Kimi K2 via Groq Premium
197 tokens/sec
2000 character limit reached

Parameter-Efficient Fine-Tuning via Selective Discrete Cosine Transform (2410.09103v1)

Published 9 Oct 2024 in cs.LG and cs.AI

Abstract: In the era of LLMs, parameter-efficient fine-tuning (PEFT) has been extensively studied. However, these approaches usually rely on the space domain, which encounters storage challenges especially when handling extensive adaptations or larger models. The frequency domain, in contrast, is more effective in compressing trainable parameters while maintaining the expressive capability. In this paper, we propose a novel Selective Discrete Cosine Transformation (sDCTFT) fine-tuning scheme to push this frontier. Its general idea is to exploit the superior energy compaction and decorrelation properties of DCT to improve both model efficiency and accuracy. Specifically, it projects the weight change from the low-rank adaptation into the discrete cosine space. Then, the weight change is partitioned over different levels of the discrete cosine spectrum, and the most critical frequency components in each partition are selected. Extensive experiments on four benchmark datasets demonstrate the superior accuracy, reduced computational cost, and lower storage requirements of the proposed method over the prior arts. For instance, when performing instruction tuning on the LLaMA3.1-8B model, sDCTFT outperforms LoRA with just 0.05M trainable parameters compared to LoRA's 38.2M, and surpasses FourierFT with 30\% less trainable parameters. The source code will be publicly available.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.