Aggregated Knowledge Model: Enhancing Domain-Specific QA with Fine-Tuned and Retrieval-Augmented Generation Models (2410.18344v1)

Published 24 Oct 2024 in cs.CL, cs.AI, and cs.LG

Abstract: This paper introduces a novel approach to enhancing closed-domain Question Answering (QA) systems, focusing on the specific needs of the Lawrence Berkeley National Laboratory (LBL) Science Information Technology (ScienceIT) domain. Utilizing a rich dataset derived from the ScienceIT documentation, our study embarks on a detailed comparison of two fine-tuned LLMs and five retrieval-augmented generation (RAG) models. Through data processing techniques, we transform the documentation into structured context-question-answer triples, leveraging the latest LLMs (AWS Bedrock, GCP PaLM2, Meta LLaMA2, OpenAI GPT-4, Google Gemini-Pro) for data-driven insights. Additionally, we introduce the Aggregated Knowledge Model (AKM), which synthesizes responses from the seven models mentioned above using K-means clustering to select the most representative answers. The evaluation of these models across multiple metrics offers a comprehensive look into their effectiveness and suitability for the LBL ScienceIT environment. The results demonstrate the potential benefits of integrating fine-tuning and retrieval-augmented strategies, highlighting significant performance improvements achieved with the AKM. The insights gained from this study can be applied to develop specialized QA systems tailored to specific domains.
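
The abstract's central technical step is the AKM's aggregation: answers from the seven fine-tuned and RAG models are clustered with K-means and the most representative answer is selected. The sketch below illustrates one way that selection could work, assuming TF-IDF embeddings, scikit-learn's KMeans, and a "largest cluster, nearest centroid" rule; the paper's actual embedding model, choice of k, and selection criterion are not specified in this excerpt.

```python
# Illustrative AKM-style aggregation (a sketch under the assumptions above,
# not the paper's exact method): embed candidate answers, cluster them with
# K-means, and return the answer closest to the centroid of the largest cluster.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def aggregate_answers(candidate_answers: list[str], n_clusters: int = 2) -> str:
    """Pick the most representative answer from several model outputs."""
    # Vectorize the answers; TF-IDF is an assumption, not the paper's stated embedding.
    X = TfidfVectorizer().fit_transform(candidate_answers).toarray()

    # Cluster the answer vectors.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = km.fit_predict(X)

    # Take the largest cluster, then the member answer nearest its centroid.
    largest = int(np.argmax(np.bincount(labels)))
    members = np.where(labels == largest)[0]
    dists = np.linalg.norm(X[members] - km.cluster_centers_[largest], axis=1)
    return candidate_answers[members[np.argmin(dists)]]


# Hypothetical outputs from the two fine-tuned and five RAG models for one question.
answers = [
    "Submit batch jobs with the sbatch command.",
    "Use sbatch to queue jobs on the cluster.",
    "Jobs are submitted via sbatch from a login node.",
    "Contact the ScienceIT help desk to open an account.",
    "Run sbatch <script> to submit a job.",
    "Batch work is submitted through sbatch.",
    "Email support for account requests.",
]
print(aggregate_answers(answers))
```

In this sketch, agreement among the models drives the choice: the sbatch-style answers form the dominant cluster, so the one nearest that cluster's centroid is returned as the aggregated answer.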

Authors (5)
  1. Fengchen Liu
  2. Jordan Jung
  3. Wei Feinstein
  4. Jeff DAmbrogia
  5. Gary Jung