Fine Tuning LLM for Enterprise: Practical Guidelines and Recommendations (2404.10779v1)
Abstract: There is a compelling need for enterprises to fine-tune LLMs so that they are trained on proprietary domain knowledge. The challenge is to imbue the LLMs with domain-specific knowledge using the most optimal resources and cost, and in the best possible time. Many enterprises rely on RAG (Retrieval Augmented Generation), which does not require LLMs to be fine-tuned, but they are limited by the quality of vector databases and their retrieval capabilities rather than by the intrinsic capabilities of the LLMs themselves. In our current work we focus on fine-tuning LLaMA, an open-source LLM, using proprietary documents and code from an enterprise repository, and we use the fine-tuned models to evaluate the quality of responses. As part of this work, we aim to guide beginners on how to start fine-tuning an LLM for documentation and code by making educated guesses on the size of GPU required and the options available for formatting the data. We also propose preprocessing recipes for both documentation and code to prepare datasets in different formats. The proposed methods of data preparation for document datasets are forming paragraph chunks, forming question-and-answer pairs, and forming keyword-paragraph chunk pairs. For the code dataset we propose forming summary-function pairs. Further, we qualitatively evaluate the results of the models on domain-specific queries. Finally, we propose practical guidelines and recommendations for fine-tuning LLMs.
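As a rough illustration of the paragraph-chunk format described in the abstract, the sketch below splits a plain-text document into paragraph chunks and writes one JSONL training record per chunk. The input filename `enterprise_doc.txt`, the chunk-size limit, and the `{"text": ...}` record schema are illustrative assumptions, not the authors' exact recipe.

```python
# Minimal sketch (assumptions noted above): build paragraph-chunk records
# for fine-tuning from a single plain-text document.
import json
import re
from typing import Iterable


def paragraph_chunks(text: str, max_chars: int = 1500) -> Iterable[str]:
    """Split a document on blank lines and pack paragraphs into chunks of at most max_chars."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    buf = ""
    for p in paragraphs:
        if len(buf) + len(p) + 2 <= max_chars:
            buf = f"{buf}\n\n{p}" if buf else p
        else:
            if buf:
                yield buf
            buf = p
    if buf:
        yield buf


def to_records(chunks: Iterable[str]) -> Iterable[dict]:
    """Wrap each chunk as a plain-text training record (paragraph-chunk format)."""
    for chunk in chunks:
        yield {"text": chunk}


if __name__ == "__main__":
    with open("enterprise_doc.txt", encoding="utf-8") as f:  # assumed input file
        doc = f.read()
    with open("train_paragraph_chunks.jsonl", "w", encoding="utf-8") as out:
        for record in to_records(paragraph_chunks(doc)):
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
```

The other document formats mentioned in the abstract follow the same pattern: each chunk would be paired with generated question-and-answer pairs or with extracted keywords (e.g., from a keyword-extraction tool) instead of being stored on its own, and the code dataset would pair a summary with each function.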
Authors: Mathav Raj J, Kushala VM, Harikrishna Warrier, Yogesh Gupta