Dial-insight: Fine-tuning Large Language Models with High-Quality Domain-Specific Data Preventing Capability Collapse (2403.09167v1)

Published 14 Mar 2024 in cs.CL

Abstract: The efficacy of LLMs is heavily dependent on the quality of the underlying data, particularly within specialized domains. A common challenge when fine-tuning LLMs for domain-specific applications is the potential degradation of the model's generalization capabilities. To address these issues, we propose a two-stage approach for the construction of production prompts designed to yield high-quality data. This method involves the generation of a diverse array of prompts that encompass a broad spectrum of tasks and exhibit a rich variety of expressions. Furthermore, we introduce a cost-effective, multi-dimensional quality assessment framework to ensure the integrity of the generated labeling data. Utilizing a dataset comprised of service provider and customer interactions from the real estate sector, we demonstrate a positive correlation between data quality and model performance. Notably, our findings indicate that the domain-specific proficiency of general LLMs can be enhanced through fine-tuning with data produced via our proposed method, without compromising their overall generalization abilities, even when exclusively domain-specific data is employed for fine-tuning.
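
The abstract outlines a two-stage pipeline (diverse prompt construction followed by a multi-dimensional quality assessment of the labeled data) without implementation detail. The sketch below is only a minimal illustration of that general idea under stated assumptions, not the authors' code: the task and expression templates, the quality dimensions, the threshold, and the `judge` callable are all hypothetical stand-ins for whatever templates and judge model the actual framework uses.

```python
"""Minimal sketch of a two-stage data pipeline: (1) build diverse
domain prompts from task x expression templates, (2) keep only samples
that pass a multi-dimensional quality check. All names, dimensions,
and thresholds here are illustrative assumptions, not the paper's."""

from dataclasses import dataclass
from itertools import product
from typing import Callable


@dataclass
class Sample:
    prompt: str
    response: str
    scores: dict[str, float] | None = None


# Stage 1: cross task types with varied phrasings to broaden coverage.
TASKS = ["summarize the dialogue", "extract the customer's intent"]
STYLES = [
    "Formally: {task}.",
    "In one short sentence, {task}.",
    "Please {task} for the agent.",
]

def build_prompts(dialogue: str) -> list[str]:
    return [
        f"{style.format(task=task)}\n\nDialogue:\n{dialogue}"
        for task, style in product(TASKS, STYLES)
    ]


# Stage 2: multi-dimensional quality assessment; the judge is a stand-in
# for an LLM-based or trained scorer.
DIMENSIONS = ["relevance", "factuality", "fluency"]
THRESHOLD = 0.7  # assumed per-dimension cutoff

def passes(sample: Sample, judge: Callable[[str, str, str], float]) -> bool:
    sample.scores = {d: judge(sample.prompt, sample.response, d) for d in DIMENSIONS}
    return all(score >= THRESHOLD for score in sample.scores.values())


if __name__ == "__main__":
    dialogue = (
        "Customer: Is the apartment still available?\n"
        "Agent: Yes, viewings are open this week."
    )
    # Hypothetical judge; a real pipeline would query a scoring model.
    dummy_judge = lambda prompt, response, dim: 0.9
    candidates = (Sample(p, "Viewing availability inquiry.") for p in build_prompts(dialogue))
    kept = [s for s in candidates if passes(s, dummy_judge)]
    print(f"kept {len(kept)} of {len(TASKS) * len(STYLES)} samples")
```

Requiring every quality dimension to clear a threshold, rather than averaging scores, reflects the abstract's emphasis that data quality, not just quantity, is what preserves both domain proficiency and generalization after fine-tuning.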
