Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning (2307.03692v1)

Published 5 Jul 2023 in cs.CL and cs.AI

Abstract: In this paper, we introduce the Instruction Following Score (IFS), a metric that measures LLMs' ability to follow instructions. The metric has a dual purpose. First, IFS can be used to distinguish between base and instruct models. We benchmark publicly available base and instruct models, and show that the ratio of well-formatted responses to partial and full sentences can be an effective measure for separating those two model classes. Secondly, the metric can be used as an early stopping criterion for instruct tuning. We compute IFS for Supervised Fine-Tuning (SFT) of 7B and 13B LLaMA models, showing that models learn to follow instructions relatively early in the training process, and that further fine-tuning can result in changes to the underlying base model's semantics. As an example of such a semantic change, we examine the objectivity of model predictions, as defined by an auxiliary metric, ObjecQA. We show that in this particular case, semantic changes are steepest when the IFS tends to plateau. We hope that decomposing instruct tuning into IFS and semantic factors starts a new trend in better controllable instruct tuning and opens possibilities for designing minimal instruct interfaces for querying foundation models.

Summary

  • The paper’s main contribution is the development of the Instruction Following Score (IFS) to measure how well models adhere to instructions during fine-tuning.
  • It empirically evaluates models such as LLaMA, Palmyra, and GPT variants, demonstrating that instruct-tuned models achieve higher IFS values and handle partial instructions differently from complete ones.
  • The study shows that using IFS as an early stopping criterion minimizes unwanted semantic shifts, preserving the core knowledge of base models.

Instruction Following and Semantic Shifts in LLMs: A Study of the Instruction Following Score

The research paper "Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning" examines the tuning of LLMs by introducing a novel metric, the Instruction Following Score (IFS). This score evaluates a model's ability to adhere to provided instructions and distinguishes base models from instruction-following models. The distinction matters because the capacity to follow instructions is a prerequisite for models expected to function as conversational agents.

Introduction of the Instruction Following Score

The authors propose the IFS, calculated as the ratio of "answer-like" to "continuation-like" responses, to assess a model's instructional alignment. The metric quantitatively differentiates models that adhere to instructions from those that do not. Empirical evaluations on publicly available models show that IFS reliably identifies different levels of instruction adherence, which is essential for understanding how far models fine-tuned on instruction data have moved toward conversational behavior relative to their base counterparts.
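
The summary does not reproduce the paper's exact classification rules, so the following is a minimal sketch under stated assumptions: a response counts as "answer-like" when it reads as a well-formed standalone answer, and IFS is computed as the fraction of answer-like responses (a close proxy for the answer-like to continuation-like ratio). The looks_answer_like heuristic is hypothetical, not the paper's actual classifier.

```python
from typing import Iterable

def looks_answer_like(response: str) -> bool:
    """Heuristic stand-in for the paper's classifier: a response is
    'answer-like' if it looks like a well-formed standalone answer
    (starts uppercase, ends with terminal punctuation)."""
    text = response.strip()
    return bool(text) and text[0].isupper() and text[-1] in ".!?"

def instruction_following_score(responses: Iterable[str]) -> float:
    """Fraction of responses classified as answer-like rather than
    continuation-like; higher values indicate stronger instruction following."""
    responses = list(responses)
    if not responses:
        return 0.0
    return sum(looks_answer_like(r) for r in responses) / len(responses)

# A base model tends to continue the prompt; an instruct model answers it.
base_outputs = ["and then the weather in Paris is usually", "which means that"]
instruct_outputs = ["The capital of France is Paris.", "Yes, that is safe to do."]
print(instruction_following_score(base_outputs))      # 0.0
print(instruction_following_score(instruct_outputs))  # 1.0
```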

Instruction Following Score in Model Evaluation

The authors assessed a variety of models, including the LLaMA series, Palmyra, and GPT variants, and found that instruct-tuned models generally attain higher IFS values. Notably, such models also handled partial and complete instructions discernibly differently, reflecting their alignment with instruction-following objectives. Beyond benchmarking, IFS can serve as an early stopping criterion during fine-tuning: halting once the score plateaus helps prevent semantic drift that would alter the base model's foundational understanding.
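
As a concrete illustration of the early-stopping idea, the sketch below halts fine-tuning once IFS stops improving across a window of evaluations. The patience-and-tolerance rule and the synthetic IFS trajectory are assumptions made for illustration; the paper's actual stopping procedure may differ.

```python
def ifs_plateaued(history: list[float], patience: int = 3, tol: float = 0.01) -> bool:
    """Return True once IFS has improved by less than `tol` over the last
    `patience` evaluations, i.e. the format-infusion phase looks complete."""
    if len(history) < patience + 1:
        return False
    window = history[-(patience + 1):]
    return max(window) - window[0] < tol

# Synthetic IFS trajectory: rapid early gains, then a plateau (illustrative).
trajectory = [0.10, 0.45, 0.70, 0.82, 0.86, 0.87, 0.875, 0.876, 0.877]
history: list[float] = []
for step, ifs in enumerate(trajectory):
    history.append(ifs)  # in practice, re-evaluate IFS at each SFT checkpoint
    if ifs_plateaued(history):
        print(f"stop fine-tuning at evaluation {step}: IFS ~ {ifs:.3f}")
        break
```

In a real SFT loop, `history` would be fed by periodic IFS evaluations on a held-out prompt set, and hitting the plateau would trigger a checkpoint save rather than a print.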

Semantic Implications and ObjecQA

To probe the semantic impact of instruction tuning, the paper introduces ObjecQA, an auxiliary metric that measures shifts in model objectivity. ObjecQA evaluates responses to subjective questions, offering insight into whether fine-tuning induces undesirable biases or alters the semantic character of responses. Notably, the results indicate that semantic shifts, as measured by ObjecQA, are steepest around the point where IFS plateaus, suggesting two distinct phases of fine-tuning: an initial format-infusion phase followed by a knowledge-infusion phase.
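
The summary does not spell out how ObjecQA scores objectivity, so the sketch below substitutes a simple phrase-matching heuristic: an answer to a subjective question counts as objective when it hedges rather than asserts an opinion. The marker lists and the is_objective rule are illustrative assumptions, not the paper's protocol.

```python
# Rough markers of opinionated vs. neutral phrasing (assumed, for illustration).
SUBJECTIVE_MARKERS = ("i think", "i believe", "in my opinion", "the best is")
OBJECTIVE_MARKERS = ("it depends", "opinions vary", "some people", "there is no single")

def is_objective(answer: str) -> bool:
    """Classify an answer to a subjective question as objective (hedged)
    or subjective (opinionated) using simple phrase matching."""
    text = answer.lower()
    if any(marker in text for marker in SUBJECTIVE_MARKERS):
        return False
    return any(marker in text for marker in OBJECTIVE_MARKERS)

def objecqa_score(answers: list[str]) -> float:
    """Fraction of answers to subjective questions that remain objective."""
    if not answers:
        return 0.0
    return sum(is_objective(a) for a in answers) / len(answers)

# Example answers to "What is the best programming language?"
answers = [
    "It depends on the task; opinions vary widely.",
    "I think Python is the best language, hands down.",
]
print(objecqa_score(answers))  # 0.5
```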

This delineation between acquiring an instruction-following tone and undergoing semantic shifts offers a practical handle on the fine-tuning process. Understanding the two phases lets researchers tune just long enough to instill instruction-following behavior while minimizing alterations to the base model's knowledge.

Implications and Future Directions

The implications are significant: IFS provides an early intervention point that limits unwanted semantic alterations while still achieving the desired instruction-following behavior. Decoupling format learning from semantic change enables the development of task-specific instruct models without degrading foundational model competencies. Future research could explore the composability of instruction-tuned features, enabling more nuanced control over aspects such as helpfulness, formality, and informativeness in model responses.

Overall, the paper contributes valuable insights into the fine-tuning of LLMs for instruction tasks, laying the groundwork for targeted improvements in model design and training efficiency. For the broader field of AI, this research highlights the importance of evaluating and controlling the consequences of instruction tuning, ensuring that models maintain alignment with both instructional objectives and underlying knowledge integrity.
