Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models (2407.21417v1)

Published 31 Jul 2024 in cs.CL

Abstract: Modern language models (LMs) need to follow human instructions while remaining faithful; yet they often fail to achieve both. Here, we provide concrete evidence of a trade-off between instruction following (i.e., following open-ended instructions) and faithfulness (i.e., grounding responses in the given context) when training LMs with these objectives. For instance, fine-tuning LLaMA-7B on instruction-following datasets renders it less faithful. Conversely, instruction-tuned Vicuna-7B shows degraded performance at following instructions when further optimized on tasks that require contextual grounding. One common remedy is multi-task learning (MTL) with data mixing, yet it remains far from achieving a synergistic outcome. We propose a simple yet effective method that relies on Rejection Sampling for Continued Self-instruction Tuning (ReSet), which significantly outperforms vanilla MTL. Surprisingly, we find that less is more: training ReSet with high-quality yet substantially smaller data (three-fold less) yields superior results. Our findings offer a better understanding of objective discrepancies in alignment training of LMs.
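The core idea the abstract names, rejection sampling for continued tuning, can be sketched in a few lines: sample several candidate responses per instruction, score them, and keep only the best-scoring ones as new fine-tuning data. This is a minimal illustration, not the paper's implementation; the `generate` and `score` functions and the threshold value are stand-in assumptions.

```python
import random

def rejection_sample_dataset(prompts, generate, score, k=4, threshold=0.8):
    """Build a continued-tuning dataset by sampling k candidate responses
    per prompt and keeping only the best one if it clears the threshold
    (rejection sampling over model outputs)."""
    kept = []
    for prompt in prompts:
        candidates = [generate(prompt) for _ in range(k)]
        best = max(candidates, key=lambda r: score(prompt, r))
        if score(prompt, best) >= threshold:
            kept.append((prompt, best))
    return kept

# Toy stand-ins for the model and the quality scorer.
def toy_generate(prompt):
    return prompt + " -> answer v" + str(random.randint(0, 9))

def toy_score(prompt, response):
    # Pretend responses with a higher trailing digit are more faithful.
    return int(response[-1]) / 10

random.seed(0)
data = rejection_sample_dataset(["q1", "q2", "q3"], toy_generate, toy_score)
```

Every pair retained in `data` clears the quality threshold, which is what lets a smaller, higher-quality dataset (the paper's "less is more" finding) come out of a larger pool of sampled generations.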

Authors (9)
  1. Zhengxuan Wu (37 papers)
  2. Yuhao Zhang (107 papers)
  3. Peng Qi (55 papers)
  4. Yumo Xu (14 papers)
  5. Rujun Han (19 papers)
  6. Yian Zhang (12 papers)
  7. Jifan Chen (12 papers)
  8. Bonan Min (20 papers)
  9. Zhiheng Huang (33 papers)