Large Language Model Instruction Following: A Survey of Progresses and Challenges (2303.10475v8)

Published 18 Mar 2023 in cs.CL

Abstract: Task semantics can be expressed by a set of input-output examples or a piece of textual instruction. Conventional machine learning approaches for NLP mainly rely on the availability of large-scale sets of task-specific examples. Two issues arise: first, collecting task-specific labeled examples does not apply to scenarios where tasks may be too complicated or costly to annotate, or the system is required to handle a new task immediately; second, this is not user-friendly since end-users are probably more willing to provide task description rather than a set of examples before using the system. Therefore, the community is paying increasing interest in a new supervision-seeking paradigm for NLP: learning to follow task instructions, i.e., instruction following. Despite its impressive progress, there are some common issues that the community struggles with. This survey paper tries to summarize and provide insights to the current research on instruction following, particularly, by answering the following questions: (i) What is task instruction, and what instruction types exist? (ii) How to model instructions? (iii) What are popular instruction following datasets and evaluation metrics? (iv) What factors influence and explain the instructions' performance? (v) What challenges remain in instruction following? To our knowledge, this is the first comprehensive survey about instruction following.

Authors (3)
  1. Renze Lou
  2. Kai Zhang
  3. Wenpeng Yin
Citations (17)

Summary

A Comprehensive Survey on Instruction Following

The paper "Large Language Model Instruction Following: A Survey of Progresses and Challenges" presents an extensive exploration of instruction-following paradigms in NLP. The survey focuses on the shift from traditional example-based supervision toward learning from task instructions, and examines the main challenges this emerging paradigm raises.

Overview of Instruction Following

The authors categorize task instructions into three primary types: NLI-oriented, LLM-oriented, and human-oriented instructions. Each type supplies indirect supervision in a distinct way:

  • NLI-oriented instructions recast target NLP problems as natural language inference tasks, so that existing NLI datasets provide indirect supervision.
  • LLM-oriented instructions, i.e., prompts, reformat inputs to match the pretraining objectives of LLMs, optimizing zero-shot and few-shot performance.
  • Human-oriented instructions are the verbose, human-readable task descriptions originally written for crowdsourcing annotators; their length and complexity make them harder for models to encode and follow.
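As a concrete illustration (not drawn from the paper), the same sentiment-analysis input can be rendered in each of the three instruction styles; the templates below are hypothetical examples of each style, not datasets cited by the survey:

```python
# Hypothetical templates for the three instruction styles described above.

def nli_oriented(text: str, label: str) -> tuple[str, str]:
    """Recast classification as NLI: the input becomes the premise and
    each candidate label becomes a hypothesis to verify."""
    premise = text
    hypothesis = f"This review expresses a {label} sentiment."
    return premise, hypothesis

def llm_oriented(text: str) -> str:
    """A short prompt shaped to match an LLM's pretraining objective
    (completion-style continuation)."""
    return f"Review: {text}\nSentiment (positive or negative):"

def human_oriented(text: str) -> str:
    """A verbose, crowdsourcing-style task description of the kind
    originally written for human annotators."""
    return (
        "Definition: In this task, you are given a product review. "
        "Decide whether the overall sentiment of the review is positive "
        "or negative, and answer with one word.\n"
        f"Input: {text}\nOutput:"
    )

review = "The battery dies within an hour."
print(nli_oriented(review, "negative")[1])
print(llm_oriented(review))
```

The point of the contrast: the underlying task is identical, but each framing targets a different source of supervision (NLI data, pretraining objectives, or human annotation guidelines).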

Key Modeling Strategies

For modeling these instructions, the paper outlines several strategies. Semantic parsing converts instructions into executable logical forms but is limited to narrow domains. The flatten-and-concatenate approach, which simply prepends the instruction to each input, is straightforward but inefficient and heavily dependent on the scale of training data. HyperNetwork-based methods offer a more structured alternative by mapping instructions directly into task-specific model parameters. Finally, reinforcement learning from human feedback (RLHF) is recognized for aligning LLM outputs with human preferences, albeit at a significant cost in human labor.
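A minimal sketch of the hypernetwork idea, assuming a toy linear setup (the embedding sizes, the single linear hypernetwork, and the classifier head are all illustrative, not the architecture of any system cited in the survey):

```python
import numpy as np

# Toy hypernetwork: a linear map turns an instruction embedding into the
# weight matrix of a task-specific classifier, so one shared model can
# generate per-task parameters directly from the instruction.

rng = np.random.default_rng(0)
d_instr, d_feat, n_classes = 16, 8, 2

# Hypernetwork parameters: instruction embedding -> flattened classifier weights.
H = rng.normal(0, 0.1, size=(d_instr, d_feat * n_classes))

def generate_classifier(instr_emb: np.ndarray) -> np.ndarray:
    """Map an instruction embedding to a (d_feat, n_classes) weight matrix."""
    return (instr_emb @ H).reshape(d_feat, n_classes)

def predict(instr_emb: np.ndarray, x: np.ndarray) -> int:
    """Classify input x with the weights generated for this instruction."""
    W = generate_classifier(instr_emb)
    logits = x @ W
    return int(np.argmax(logits))

instr = rng.normal(size=d_instr)  # stand-in for an encoded instruction
x = rng.normal(size=d_feat)       # stand-in for an encoded input
print(predict(instr, x))          # 0 or 1
```

In practice the hypernetwork is itself a trained neural network and the generated parameters are adapters or full layers, but the data flow is the same: instruction in, task parameters out.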

Datasets and Evaluation

The paper distinguishes between human-annotated and LLM-synthesized datasets for instruction tuning. Human-annotated datasets are high-quality but limited in diversity, while LLM-synthesized datasets offer diversity at a potential cost in accuracy. Evaluation techniques divide into task-centric approaches, which score outputs against gold references with automatic metrics, and human-centric approaches, which judge how well outputs satisfy user intent. Both face trade-offs stemming from the subjective nature of instruction effectiveness and alignment with human expectations.
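Task-centric evaluation typically reduces to automatic, reference-based metrics. Below is a sketch of two common ones, exact match and token-level F1, in their generic formulations (these are standard definitions, not any particular benchmark's official scorer):

```python
from collections import Counter

def exact_match(pred: str, gold: str) -> float:
    """1.0 iff prediction and reference match after light normalization."""
    return float(pred.strip().lower() == gold.strip().lower())

def token_f1(pred: str, gold: str) -> float:
    """Harmonic mean of token-overlap precision and recall."""
    p, g = pred.lower().split(), gold.lower().split()
    common = Counter(p) & Counter(g)   # multiset intersection
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Positive", "positive"))                 # 1.0
print(round(token_f1("a negative review", "negative"), 2)) # 0.5
```

Human-centric evaluation has no such closed-form scorer, which is exactly why the survey flags it as the harder, more subjective side of the trade-off.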

Influencing Factors

Performance of instruction following is influenced by several dimensions, including model scale, instruction diversity, and task scale. Notably, scaling model size and scaling task diversity reinforce each other, a synergy the survey refers to as dual-track scaling, which underpins the strong few- and zero-shot generalization of instruction-tuned models.

Challenges and Future Directions

Critical challenges remain in negation handling, vulnerability to adversarial instruction attacks, and the interpretability-versus-performance trade-off in instruction engineering. Future research directions include enhancing LLMs' robustness to negated information, defending against adversarial attacks, and balancing human and model alignment in instruction design.
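The negation challenge can be probed with paired instructions that differ only in a negation. The probe below is a hypothetical sketch (the templates and the `answer_fn` interface are illustrative assumptions, not an evaluation protocol from the survey):

```python
# Hypothetical minimal-pair probe: if a model returns the same answer for
# an instruction and its negated counterpart, it likely ignored the negation.

TEMPLATES = [
    ("List three animals that can fly.",
     "List three animals that cannot fly."),
    ("Write a sentence that contains the word 'ocean'.",
     "Write a sentence that does not contain the word 'ocean'."),
]

def negation_probe(answer_fn):
    """Run a model callable on each pair; collect pairs whose answers
    are identical, a symptom of negation being ignored."""
    failures = []
    for pos, neg in TEMPLATES:
        if answer_fn(pos).strip() == answer_fn(neg).strip():
            failures.append((pos, neg))
    return failures

# A degenerate "model" that ignores its instruction fails every pair:
print(len(negation_probe(lambda prompt: "sparrow, eagle, bat")))  # 2
```

Real studies of negated prompts follow the same minimal-pair logic, just with larger template sets and model-specific answer scoring.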

Conclusion

By tracing the field's development from early machine learning approaches to modern LLMs, this survey provides a well-rounded perspective on instruction following. As the first comprehensive survey of its kind, it offers valuable guidance for researchers aiming to push the boundaries of cross-task generalization with instruction-tuned systems, and its outlined challenges and research directions chart a clear pathway for future work.