Massive Supervised Fine-tuning Experiments Reveal How Data, Layer, and Training Factors Shape LLM Alignment Quality (2506.14681v1)

Published 17 Jun 2025 in cs.CL

Abstract: Supervised fine-tuning (SFT) is a critical step in aligning LLMs with human instructions and values, yet many aspects of SFT remain poorly understood. We trained a wide range of base models on a variety of datasets, including code generation, mathematical reasoning, and general-domain tasks, resulting in 1,000+ SFT models under controlled conditions. We then identified the dataset properties that matter most and examined the layer-wise modifications introduced by SFT. Our findings reveal that some training-task synergies persist across all models while others vary substantially, emphasizing the importance of model-specific strategies. Moreover, we demonstrate that perplexity consistently predicts SFT effectiveness, often surpassing superficial similarity between the training data and the benchmark, and that mid-layer weight changes correlate most strongly with performance gains. We will release these 1,000+ SFT models and benchmark results to accelerate further research.

Authors (6)

Yuto Harada (2 papers)
Yusuke Yamauchi (4 papers)
Yusuke Oda (15 papers)
Yohei Oseki (22 papers)
Yusuke Miyao (35 papers)
Yu Takagi (11 papers)


Tweets

https://twitter.com/ariwaramiya/status/1935277851613479310

https://twitter.com/arxivsanitybot/status/1935334048613773323
