LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views (2402.04644v2)

Published 7 Feb 2024 in cs.LG and cs.AI

Abstract: Fine-tuning is widely used to leverage the power of pre-trained foundation models on new downstream tasks. While fine-tuning has seen many successes across tasks, recent studies have observed challenges in the generalization of fine-tuned models to unseen distributions (i.e., out-of-distribution; OOD). To improve OOD generalization, some previous studies identify limitations of the fine-tuning data and regularize fine-tuning to preserve the general representation learned from pre-training data. However, potential limitations in the pre-training data and models are often ignored. In this paper, we contend that overly relying on the pre-trained representation may hinder fine-tuning from learning essential representations for downstream tasks and thus hurt OOD generalization. This can be especially catastrophic when new tasks come from different (sub)domains than the pre-training data. To address the issues in both pre-training and fine-tuning data, we propose LEVI (Layer-wise Ensemble of different VIews), a novel generalizable fine-tuning method in which the pre-trained model is adaptively ensembled layer-wise with a small task-specific model, while preserving efficiency. By combining two complementary models, LEVI effectively suppresses problematic features in both the fine-tuning data and the pre-trained model and preserves features useful for new tasks. Broad experiments with large language and vision models show that LEVI greatly improves fine-tuning generalization by emphasizing different views from the fine-tuning data and the pre-trained features.
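
To make the abstract's core idea concrete, below is a minimal PyTorch sketch of an adaptive layer-wise ensemble: a frozen pre-trained backbone and a small task-specific model are run side by side, and a learnable per-layer gate mixes their intermediate features. The class name `LayerwiseEnsemble`, the sigmoid gating, and the assumption that both models share the same hidden width are illustrative choices based on the abstract's high-level description, not the paper's exact formulation.

```python
import torch
import torch.nn as nn


class LayerwiseEnsemble(nn.Module):
    """Sketch of a layer-wise ensemble: a frozen pre-trained backbone and a
    small task-specific model are mixed layer by layer via learnable gates."""

    def __init__(self, pretrained_layers, task_layers, hidden_dim, num_classes):
        super().__init__()
        assert len(pretrained_layers) == len(task_layers)
        self.pretrained_layers = nn.ModuleList(pretrained_layers)
        self.task_layers = nn.ModuleList(task_layers)
        # One learnable scalar gate per layer decides how strongly the
        # task-specific view overrides the pre-trained view at that depth.
        self.gates = nn.Parameter(torch.zeros(len(task_layers)))
        self.head = nn.Linear(hidden_dim, num_classes)
        # Freeze the pre-trained backbone; only the small model, the gates,
        # and the prediction head receive gradients.
        for p in self.pretrained_layers.parameters():
            p.requires_grad = False

    def forward(self, x):
        h_pre, h_task = x, x
        for layer_pre, layer_task, gate in zip(
            self.pretrained_layers, self.task_layers, self.gates
        ):
            h_pre = layer_pre(h_pre)
            h_task = layer_task(h_task)
            alpha = torch.sigmoid(gate)                    # mixing weight in (0, 1)
            h_task = alpha * h_task + (1 - alpha) * h_pre  # ensemble the two views
        return self.head(h_task)


# Illustrative usage with toy MLP blocks of matching width (assumed here only
# so the two feature streams can be mixed directly).
dim = 64
pretrained = [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(4)]
task_model = [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(4)]
model = LayerwiseEnsemble(pretrained, task_model, hidden_dim=dim, num_classes=10)
logits = model(torch.randn(8, dim))  # -> shape (8, 10)
```

The gated sum lets each depth decide how much to trust the pre-trained view versus the task-specific view, which is one simple way to realize the abstract's goal of suppressing problematic features from either source while keeping features useful for the new task.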

Authors (11)
  1. Yuji Roh (11 papers)
  2. Qingyun Liu (6 papers)
  3. Huan Gui (11 papers)
  4. Zhe Yuan (75 papers)
  5. Yujin Tang (31 papers)
  6. Steven Euijong Whang (27 papers)
  7. Liang Liu (237 papers)
  8. Shuchao Bi (5 papers)
  9. Lichan Hong (35 papers)
  10. Ed H. Chi (74 papers)
  11. Zhe Zhao (97 papers)
Citations (1)