A Fine-tuning Enhanced RAG System with Quantized Influence Measure as AI Judge (2402.17081v1)

Published 26 Feb 2024 in cs.IR

Abstract: This study presents an innovative enhancement to retrieval-augmented generation (RAG) systems by seamlessly integrating fine-tuned LLMs with vector databases. This integration capitalizes on the combined strengths of structured data retrieval and the nuanced comprehension provided by advanced LLMs. Central to our approach are the LoRA and QLoRA methodologies, which stand at the forefront of model refinement through parameter-efficient fine-tuning and memory optimization. A novel feature of our research is the incorporation of user feedback directly into the training process, ensuring the model's continuous adaptation to user expectations, thus improving its performance and applicability. Additionally, we introduce a Quantized Influence Measure (QIM) as an innovative "AI Judge" mechanism to enhance the precision of result selection, further refining the system's accuracy. Accompanied by an executive diagram and a detailed algorithm for fine-tuning QLoRA, our work provides a comprehensive framework for implementing these advancements within chatbot technologies. This research contributes significant insights into LLM optimization for specific uses and heralds new directions for further development in retrieval-augmented models. Through extensive experimentation and analysis, our findings lay a robust foundation for future advancements in chatbot technology and retrieval systems, marking a significant step forward in the creation of more sophisticated, precise, and user-centric conversational AI systems.
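The abstract describes a pipeline in which candidate contexts are retrieved from a vector database and an "AI Judge" score then selects among them. A minimal, self-contained sketch of that flow is below; note that the toy embeddings, the `judge` scoring function, and all identifiers here are illustrative assumptions for exposition, not the paper's actual QIM mechanism or implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, store, k=2):
    """Rank stored (text, embedding) pairs by similarity to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return ranked[:k]

def judge(candidates, score_fn):
    """Stand-in 'AI Judge': keep the highest-scoring candidate.
    The paper uses a Quantized Influence Measure here; this placeholder
    only shows where such a scorer plugs into the pipeline."""
    return max(candidates, key=lambda item: score_fn(item[0]))

# Toy vector store with made-up 3-dimensional embeddings.
store = [
    ("shelter intake hours", [0.9, 0.1, 0.0]),
    ("fine-tuning schedule for QLoRA", [0.1, 0.9, 0.2]),
    ("unrelated note", [0.0, 0.1, 0.9]),
]
query = [0.85, 0.2, 0.05]  # pretend embedding of the user question

top_k = retrieve(query, store, k=2)
best, _ = judge(top_k, score_fn=len)  # placeholder score: longest text wins
print([t for t, _ in top_k], "->", best)
```

In the system the abstract describes, the retrieved passages would be fed to a fine-tuned LLM and the judge would score the generated answers rather than raw text length; this sketch only fixes the retrieve-then-judge control flow.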

Authors (2)
  1. Keshav Rangan (1 paper)
  2. Yiqiao Yin (7 papers)
Citations (5)