Data Selection for Model Alignment
Introduction
The process of improving LLMs often involves instruction tuning, where the model is fine-tuned on curated datasets after pretraining. A pivotal factor in instruction tuning efficacy is the selection of appropriate data. Although the importance of data engineering is widely acknowledged, no systematic method exists for identifying the best instruction tuning data. This paper explores data selection techniques, conducting controlled experiments and developing new metrics for appraising data, with the aim of improving instruction tuning performance.
Measuring Data for Alignment
In the context of data selection, the research measures data through controlled experiments along three dimensions: complexity, quality, and diversity. Each sample is scored on these dimensions, and selection strategies are then built on top of the scores. Benchmarks across multiple datasets reveal how variance in complexity and quality affects model performance.
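As a minimal illustration of the record shape such per-sample scoring produces (the field and class names here are hypothetical, not from the paper's code):

```python
from dataclasses import dataclass

@dataclass
class ScoredSample:
    """One instruction-response pair with its per-dimension scores."""
    instruction: str
    response: str
    complexity: float  # instruction-side score (higher = more complex)
    quality: float     # response-side score (higher = better response)
```

Diversity, by contrast, is a property of the selected set rather than of any single sample, so it is handled by the selection procedure itself (see the Repr Filter below).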
Complexity and Data Selection
Complexity is often associated with better instruction tuning outcomes. The "Evol-Complexity" method is introduced, which uses an LLM such as ChatGPT to evolve a set of instruction samples into progressively more complex variants and then score them, capturing fine-grained complexity differences that coarser metrics miss. Results show the method is robust across diverse datasets, performing well in both high- and lower-quality settings.
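A minimal sketch of the evolve-then-rank idea, assuming a generic `llm(prompt)` callable standing in for the ChatGPT calls (prompt wording and function names are illustrative, not the paper's code):

```python
def evolve_instruction(llm, instruction, rounds=4):
    """Produce progressively more complex variants of one seed instruction."""
    variants = [instruction]
    for _ in range(rounds):
        prompt = (
            "Rewrite the following instruction to make it more complex, "
            "e.g. by adding constraints or deepening the question, while "
            "keeping it answerable:\n\n" + variants[-1]
        )
        variants.append(llm(prompt))
    return variants

def score_complexity(llm, variants):
    """Ask the LLM to rank all variants of one seed in a single prompt."""
    listing = "\n".join(f"[{i}] {v}" for i, v in enumerate(variants))
    prompt = (
        "Score each instruction below from 1 (trivial) to "
        f"{len(variants)} (most complex). Reply as 'index: score' lines.\n\n"
        + listing
    )
    return llm(prompt)
```

Scoring all variants of one seed together gives the scorer a reference scale, which is what lets it resolve small complexity differences between closely related instructions.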
Quality Assessment
Quality is equally crucial, especially when the available data pool exhibits considerable variance in sample quality. "Evol-Quality" is developed, which prompts ChatGPT to iteratively improve a response and scores the resulting variants. Like Evol-Complexity, it relies on nuanced scoring and yields consistent gains in alignment performance, particularly on datasets with many low-quality examples.
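The response-side counterpart of the previous sketch, under the same `llm(prompt)` assumption:

```python
def evolve_response(llm, instruction, response, rounds=4):
    """Iteratively improve one answer, yielding a ladder of quality variants."""
    variants = [response]
    for _ in range(rounds):
        prompt = (
            "Instruction:\n" + instruction + "\n\n"
            "Current answer:\n" + variants[-1] + "\n\n"
            "Rewrite the answer to be more helpful, accurate, and detailed."
        )
        variants.append(llm(prompt))
    return variants
```

The resulting ladder of answers can then be ranked in a single prompt, exactly as the complexity variants were.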
Diversity of Data
Acknowledging that a proficient LLM should handle diverse requests, a selection strategy is formulated to ensure dataset diversity while preserving complexity and quality. An iterative embedding-based strategy, the "Repr Filter," is proposed: candidates are considered in score order, and a sample is added to the training set only if it is sufficiently far from the samples already selected. This approach outperforms competing strategies and yields better model alignment.
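A sketch of such a nearest-neighbor distance filter using NumPy (the cosine-distance threshold and the budget are illustrative placeholders, not the paper's exact settings):

```python
import numpy as np

def repr_filter(embeddings, order, threshold=0.1, budget=6000):
    """Greedy embedding-based diversity filter (sketch).

    embeddings: (n, d) array, one vector per candidate sample.
    order: candidate indices sorted by descending score.
    A candidate is kept only if its cosine distance to every sample
    already selected exceeds `threshold`.
    """
    # Normalize rows so that cosine distance = 1 - dot product.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    selected = []
    for i in order:
        if selected:
            nearest_sim = (normed[selected] @ normed[i]).max()
            if 1.0 - nearest_sim < threshold:
                continue  # too close to an already selected sample
        selected.append(i)
        if len(selected) >= budget:
            break
    return selected
```

Because candidates arrive in score order, the filter keeps the most complex, highest-quality representative of each region of embedding space.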
Data-Efficient Instruction Tuning (DEITA)
DEITA refers to models fine-tuned on data selected jointly for complexity, quality, and diversity. The proposed score-first, diversity-aware strategy, which ranks samples by their combined complexity and quality scores and then applies the diversity filter, significantly reduces the number of samples required for effective alignment. DEITA models, built on existing base LLMs, match or surpass the alignment performance of models trained on far larger datasets, underscoring the efficacy of careful data selection at a fraction of the computational cost.
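Putting the pieces together, a sketch of the score-first, diversity-aware pipeline, reusing `ScoredSample` and `repr_filter` from the earlier sketches (the multiplicative combination of the two scores and the 6,000-sample budget are assumptions for illustration):

```python
def select_deita_subset(samples, embeddings, budget=6000, threshold=0.1):
    """Score-first, diversity-aware selection (sketch).

    samples: list of ScoredSample; embeddings: (n, d) array aligned
    with samples. Rank by a combined score, then keep only samples
    that pass the diversity filter until the budget is reached.
    """
    order = sorted(
        range(len(samples)),
        key=lambda i: samples[i].complexity * samples[i].quality,
        reverse=True,
    )
    keep = repr_filter(embeddings, order, threshold=threshold, budget=budget)
    return [samples[i] for i in keep]
```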
Experimentation and Findings
Extensive experimentation with models based on different LLM architectures demonstrates that DEITA models outperform other instruction-tuned models aligned solely with supervised fine-tuning. Even against models trained with reinforcement learning from human feedback, DEITA performs commendably, particularly when direct preference optimization (DPO) is applied after the supervised stage.
Conclusion
This work establishes clear methodologies for determining what constitutes "good data" for model alignment through instruction tuning. The creation of DEITA and its associated models is a step toward more data-efficient alignment, and the models and selected datasets have been released to support further research on data efficiency.