Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning (2307.03692v1)

Published 5 Jul 2023 in cs.CL and cs.AI

Abstract: In this paper, we introduce the Instruction Following Score (IFS), a metric that measures LLMs' ability to follow instructions. The metric has a dual purpose. First, IFS can be used to distinguish between base and instruct models. We benchmark publicly available base and instruct models, and show that the ratio of well-formatted responses to partial and full sentences can be an effective measure for separating those two model classes. Secondly, the metric can be used as an early stopping criterion for instruct tuning. We compute IFS for Supervised Fine-Tuning (SFT) of 7B and 13B LLaMA models, showing that models learn to follow instructions relatively early in the training process, and that further fine-tuning can result in changes to the underlying base model's semantics. As an example of such a semantic change, we examine the objectivity of model predictions, as defined by an auxiliary metric, ObjecQA. We show that in this particular case, semantic changes are steepest when the IFS tends to plateau. We hope that decomposing instruct tuning into IFS and semantic factors starts a new trend in better controllable instruct tuning and opens possibilities for designing minimal instruct interfaces for querying foundation models.

Summary

  • The paper’s main contribution is the development of the Instruction Following Score (IFS) to measure how well models adhere to instructions during fine-tuning.
  • It empirically evaluates models such as LLaMA, Palmyra, and GPT variants, demonstrating that instruct-tuned models achieve higher IFS values and handle partial instructions differently from complete ones.
  • The study shows that using IFS as an early stopping criterion minimizes unwanted semantic shifts, preserving the core knowledge of base models.

Instruction Following and Semantic Shifts in LLMs: A Study of the Instruction Following Score

The research paper "Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning" examines the tuning of LLMs by introducing a novel metric, the Instruction Following Score (IFS). This score evaluates a model's ability to adhere to provided instructions and distinguishes base models from instruction-following models. The distinction matters because the capacity to follow instructions is a prerequisite for models expected to function as conversational agents.

Introduction of the Instruction Following Score

The authors propose the IFS, calculated as the ratio of "answer-like" to "continuation-like" responses, to assess a model's instructional alignment. The metric quantitatively differentiates models that adhere to instructions from those that do not. Empirical evaluations on publicly available models show that IFS reliably identifies different levels of instruction adherence, which is essential for understanding how far models fine-tuned on instruction data have moved toward conversational behavior relative to their base counterparts.
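
The summary does not reproduce the paper's exact classification rules, so the following is a minimal sketch under stated assumptions: a response counts as "answer-like" when it reads as a well-formed standalone answer, and IFS is computed as the fraction of answer-like responses (a close proxy for the answer-like to continuation-like ratio). The looks_answer_like heuristic is hypothetical, not the paper's actual classifier.

```python
from typing import Iterable

def looks_answer_like(response: str) -> bool:
    """Heuristic stand-in for the paper's classifier: a response is
    'answer-like' if it looks like a well-formed standalone answer
    (starts uppercase, ends with terminal punctuation)."""
    text = response.strip()
    return bool(text) and text[0].isupper() and text[-1] in ".!?"

def instruction_following_score(responses: Iterable[str]) -> float:
    """Fraction of responses classified as answer-like rather than
    continuation-like; higher values indicate stronger instruction following."""
    responses = list(responses)
    if not responses:
        return 0.0
    return sum(looks_answer_like(r) for r in responses) / len(responses)

# A base model tends to continue the prompt; an instruct model answers it.
base_outputs = ["and then the weather in Paris is usually", "which means that"]
instruct_outputs = ["The capital of France is Paris.", "Yes, that is safe to do."]
print(instruction_following_score(base_outputs))      # 0.0
print(instruction_following_score(instruct_outputs))  # 1.0
```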

Instruction Following Score in Model Evaluation

The authors assessed a variety of models, including the LLaMA series, Palmyra, and GPT variants, and found that instruct-tuned models generally attain higher IFS values. Notably, such models also handled partial and complete instructions discernibly differently, reflecting their alignment with instruction-following objectives. Beyond benchmarking, IFS can serve as an early stopping criterion during fine-tuning: halting once the score plateaus helps prevent semantic drift that would alter the base model's foundational understanding.
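
As a concrete illustration of the early-stopping idea, the sketch below halts fine-tuning once IFS stops improving across a window of evaluations. The patience-and-tolerance rule and the synthetic IFS trajectory are assumptions made for illustration; the paper's actual stopping procedure may differ.

```python
def ifs_plateaued(history: list[float], patience: int = 3, tol: float = 0.01) -> bool:
    """Return True once IFS has improved by less than `tol` over the last
    `patience` evaluations, i.e. the format-infusion phase looks complete."""
    if len(history) < patience + 1:
        return False
    window = history[-(patience + 1):]
    return max(window) - window[0] < tol

# Synthetic IFS trajectory: rapid early gains, then a plateau (illustrative).
trajectory = [0.10, 0.45, 0.70, 0.82, 0.86, 0.87, 0.875, 0.876, 0.877]
history: list[float] = []
for step, ifs in enumerate(trajectory):
    history.append(ifs)  # in practice, re-evaluate IFS at each SFT checkpoint
    if ifs_plateaued(history):
        print(f"stop fine-tuning at evaluation {step}: IFS ~ {ifs:.3f}")
        break
```

In a real SFT loop, `history` would be fed by periodic IFS evaluations on a held-out prompt set, and hitting the plateau would trigger a checkpoint save rather than a print.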

Semantic Implications and ObjecQA

To probe the semantic impact of instruction tuning, the paper introduces ObjecQA, an auxiliary metric that measures shifts in model objectivity. ObjecQA evaluates responses to subjective questions, offering insight into whether fine-tuning induces undesirable biases or alters the semantic character of responses. Notably, the results indicate that semantic shifts, as measured by ObjecQA, are steepest around the point where IFS plateaus, suggesting two distinct phases of fine-tuning: an initial format-infusion phase followed by a knowledge-infusion phase.
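
The summary does not spell out how ObjecQA scores objectivity, so the sketch below substitutes a simple phrase-matching heuristic: an answer to a subjective question counts as objective when it hedges rather than asserts an opinion. The marker lists and the is_objective rule are illustrative assumptions, not the paper's protocol.

```python
# Rough markers of opinionated vs. neutral phrasing (assumed, for illustration).
SUBJECTIVE_MARKERS = ("i think", "i believe", "in my opinion", "the best is")
OBJECTIVE_MARKERS = ("it depends", "opinions vary", "some people", "there is no single")

def is_objective(answer: str) -> bool:
    """Classify an answer to a subjective question as objective (hedged)
    or subjective (opinionated) using simple phrase matching."""
    text = answer.lower()
    if any(marker in text for marker in SUBJECTIVE_MARKERS):
        return False
    return any(marker in text for marker in OBJECTIVE_MARKERS)

def objecqa_score(answers: list[str]) -> float:
    """Fraction of answers to subjective questions that remain objective."""
    if not answers:
        return 0.0
    return sum(is_objective(a) for a in answers) / len(answers)

# Example answers to "What is the best programming language?"
answers = [
    "It depends on the task; opinions vary widely.",
    "I think Python is the best language, hands down.",
]
print(objecqa_score(answers))  # 0.5
```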

This delineation between acquiring an instruction-following tone and undergoing semantic shifts offers a practical handle on the fine-tuning process. Understanding the two phases lets researchers tune just long enough to instill instruction-following behavior while minimizing alterations to the base model's knowledge.

Implications and Future Directions

The implications are significant: IFS provides an early intervention point that limits unwanted semantic alterations while still achieving the desired instruction-following behavior. Decoupling format learning from semantic change enables the development of task-specific instruct models without degrading foundational model competencies. Future research could explore the composability of instruction-tuned features, enabling more nuanced control over aspects such as helpfulness, formality, and informativeness in model responses.

Overall, the paper contributes valuable insights into the fine-tuning of LLMs for instruction tasks, laying the groundwork for targeted improvements in model design and training efficiency. For the broader field of AI, this research highlights the importance of evaluating and controlling the consequences of instruction tuning, ensuring that models maintain alignment with both instructional objectives and underlying knowledge integrity.
