
SelectLLM: Can LLMs Select Important Instructions to Annotate? (2401.16553v7)

Published 29 Jan 2024 in cs.CL and cs.AI

Abstract: Instruction tuning benefits from large and diverse datasets; however, creating such datasets involves a high cost of human labeling. While synthetic datasets generated by LLMs have partly solved this issue, they often contain low-quality data. One effective solution is selectively annotating unlabeled instructions, especially given the relative ease of acquiring unlabeled instructions or texts from various sources. However, how to select unlabeled instructions is not well-explored, especially in the context of LLMs. Therefore, we introduce SelectLLM, an alternative framework that leverages the capabilities of LLMs to select unlabeled instructions more effectively. Specifically, SelectLLM consists of two key steps: coreset-based clustering of unlabeled instructions to increase diversity, and prompting an LLM to identify the most beneficial instructions within each cluster. We evaluate SelectLLM on AlpacaEval2 and MT-Bench, demonstrating its ability to outperform state-of-the-art methods like Alpagasus. In addition, we compare the performance and compatibility of SelectLLM with various LLMs, such as ChatGPT, LLaMA-3.1-70B, and Gemma-2-27b. SelectLLM's adaptability and robustness are further evidenced by its ability to maintain high performance across both human and synthetic datasets. All code and data are publicly available (https://github.com/minnesotanlp/select-LLM).
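The two-step pipeline described in the abstract (coreset-based clustering for diversity, then LLM-guided selection within each cluster) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the k-center greedy coreset routine is one common instantiation of coreset selection, the embedding matrix `X` is assumed to come from some sentence encoder, and `pick_with_llm` is a hypothetical callback that would wrap a real LLM prompt ranking the instructions in a cluster.

```python
import numpy as np

def kcenter_greedy(X: np.ndarray, k: int) -> list[int]:
    """Coreset selection via k-center greedy: repeatedly pick the point
    farthest from the current set of centers, maximizing coverage."""
    centers = [0]
    dists = np.linalg.norm(X - X[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))
        centers.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(X - X[nxt], axis=1))
    return centers

def assign_clusters(X: np.ndarray, centers: list[int]) -> np.ndarray:
    """Assign every instruction embedding to its nearest coreset center."""
    d = np.linalg.norm(X[:, None, :] - X[centers][None, :, :], axis=2)
    return np.argmin(d, axis=1)

def select_instructions(instructions, X, k, pick_with_llm):
    """Step 1: cluster unlabeled instructions for diversity.
    Step 2: within each cluster, let an LLM pick the most beneficial one.
    `pick_with_llm(cluster_insts)` returns an index into cluster_insts;
    a real implementation would prompt an LLM with the cluster's contents."""
    centers = kcenter_greedy(X, k)
    labels = assign_clusters(X, centers)
    selected = []
    for c in range(len(centers)):
        members = [i for i in range(len(instructions)) if labels[i] == c]
        best = pick_with_llm([instructions[i] for i in members])
        selected.append(members[best])
    return selected
```

The output is one selected instruction index per cluster, so the annotation budget `k` directly controls both diversity (via the coreset) and cost.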

References (27)
  1. Instruction mining: When data mining meets large language model finetuning.
  2. Joel Luis Carbonera and Mara Abel. 2015. A density-based approach for instance selection. In 2015 IEEE 27th International Conference on Tools with Artificial Intelligence (ICTAI), pages 768–774. IEEE.
  3. Maybe only 0.5% data is needed: A preliminary exploration of low training data instruction tuning. arXiv preprint arXiv:2305.09246.
  4. Alpagasus: Training a better alpaca with fewer data.
  5. Scaling instruction-finetuned language models. arXiv preprint arXiv:2210.11416.
  6. Free Dolly: Introducing the world's first truly open instruction-tuned LLM.
  7. QLoRA: Efficient finetuning of quantized LLMs.
  8. John A. Hartigan and Manchek A. Wong. 1979. Algorithm AS 136: A k-means clustering algorithm. Journal of the Royal Statistical Society, Series C (Applied Statistics), 28(1):100–108.
  9. Exploring the benefits of training expert language models over instruction tuning.
  10. Large language models are zero-shot reasoners. In Advances in Neural Information Processing Systems (NeurIPS).
  11. Active instruction tuning: Improving cross-task generalization by training on prompt sensitive tasks.
  12. Symbolic chain-of-thought distillation: Small models can also "think" step-by-step. In Annual Meeting of the Association for Computational Linguistics (ACL).
  13. GPTEval: NLG evaluation using GPT-4 with better human alignment. arXiv preprint arXiv:2303.16634.
  14. When less is more: Investigating data pruning for pretraining LLMs at scale.
  15. Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9.
  16. Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In Conference on Empirical Methods in Natural Language Processing (EMNLP).
  17. Multitask prompted training enables zero-shot task generalization.
  18. Ozan Sener and Silvio Savarese. 2018. Active learning for convolutional neural networks: A core-set approach.
  19. Burr Settles. 2009. Active learning literature survey.
  20. One embedder, any task: Instruction-finetuned text embeddings. In Annual Meeting of the Association for Computational Linguistics (ACL).
  21. Is ChatGPT good at search? Investigating large language models as re-ranking agents. In Conference on Empirical Methods in Natural Language Processing (EMNLP).
  22. Stanford Alpaca: An instruction-following LLaMA model. https://github.com/tatsu-lab/stanford_alpaca.
  23. LLaMA 2: Open foundation and fine-tuned chat models.
  24. Self-Instruct: Aligning language models with self-generated instructions. In Annual Meeting of the Association for Computational Linguistics (ACL).
  25. Super-NaturalInstructions: Generalization via declarative instructions on 1600+ tasks. In EMNLP.
  26. Finetuned language models are zero-shot learners.
  27. LIMA: Less is more for alignment.
Authors (4)
  1. Ritik Sachin Parkar (2 papers)
  2. Jaehyung Kim (44 papers)
  3. Jong Inn Park (4 papers)
  4. Dongyeop Kang (72 papers)
Citations (6)