Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering (2305.11541v3)

Published 19 May 2023 in cs.CL and cs.AI

Abstract: LLM has gained popularity and achieved remarkable results in open-domain tasks, but its performance in real industrial domain-specific scenarios is average due to its lack of specific domain knowledge. This issue has attracted widespread attention, but there are few relevant benchmarks available. In this paper, we provide a benchmark Question Answering (QA) dataset named MSQA, centered around Microsoft products and IT technical problems encountered by customers. This dataset contains industry cloud-specific QA knowledge, an area not extensively covered in general LLMs, making it well-suited for evaluating methods aiming to enhance LLMs' domain-specific capabilities. In addition, we propose a new model interaction paradigm that can empower LLM to achieve better performance on domain-specific tasks where it is not proficient. Extensive experiments demonstrate that the approach following our method outperforms the commonly used LLM with retrieval methods. We make our source code and sample data available at: https://aka.ms/Microsoft_QA.

Introduction

The paper addresses the challenges LLMs face when dealing with domain-specific problems. Despite their broad knowledge and remarkable performance on open-domain tasks, these models often fall short on domain-specific question answering (QA) because specialized knowledge is underrepresented in their pretraining data. This performance gap has driven growing interest in methods that adapt and improve LLMs for such contexts.

MSQA Dataset Creation

The researchers introduce a benchmark dataset called MSQA, centered on Microsoft products and the IT technical issues customers encounter. The dataset contains 32,000 QA pairs and is designed to test and enhance LLMs' domain-specific abilities. Because this industry cloud-specific knowledge is sparsely covered by general LLMs, MSQA is well suited for evaluating methods that target industrial-domain question answering. The paper also notes the high cost of fine-tuning LLMs and the risk of data leakage, since access to domain-specific data is often limited and confidential.
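
For experimentation with the released sample data, a minimal loader might look like the sketch below; the JSONL layout and the question/answer field names are assumptions for illustration, not the dataset's documented schema.

```python
# Illustrative loader for QA pairs stored one JSON object per line.
# File name and field names are assumed, not the dataset's official format.
import json
from pathlib import Path

def load_qa_pairs(path: str):
    """Yield (question, answer) tuples from a JSON Lines file."""
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        yield record["question"], record["answer"]

# Example usage (hypothetical file name):
# pairs = list(load_qa_pairs("msqa_sample.jsonl"))
# print(len(pairs), pairs[0])
```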

Methodology

The proposed approach first pre-trains a smaller LLM on domain documentation to instill domain-specific knowledge. The model is then instruction-tuned with an emphasis on QA tasks so that it can apply this knowledge. At runtime, the fine-tuned domain-specific model assists the general LLM by supplying relevant domain-specific information. This interaction paradigm circumvents traditional retrieval pipelines, making it easier to preserve privacy while keeping domain knowledge up to date.
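
A minimal sketch of this two-stage interaction, assuming a Hugging Face-style domain model: the checkpoint name and the call_general_llm helper are placeholders for illustration, not the authors' released artifacts.

```python
# Sketch of the interaction paradigm: a small domain-tuned model supplies
# knowledge, and a general LLM composes the final answer from it.
# "your-org/domain-llm-msqa" and call_general_llm are hypothetical placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

DOMAIN_MODEL = "your-org/domain-llm-msqa"  # hypothetical fine-tuned checkpoint

tokenizer = AutoTokenizer.from_pretrained(DOMAIN_MODEL)
domain_model = AutoModelForCausalLM.from_pretrained(DOMAIN_MODEL)

def domain_hint(question: str, max_new_tokens: int = 256) -> str:
    """Ask the small domain-specific model for background knowledge on the question."""
    prompt = f"Provide domain knowledge relevant to the question.\nQuestion: {question}\nKnowledge:"
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = domain_model.generate(**inputs, max_new_tokens=max_new_tokens)
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

def answer(question: str, call_general_llm) -> str:
    """Compose the final answer with a general LLM, conditioned on the domain hint."""
    hint = domain_hint(question)
    prompt = (
        "Use the following domain knowledge to answer the question.\n"
        f"Domain knowledge: {hint}\n"
        f"Question: {question}\n"
        "Answer:"
    )
    return call_general_llm(prompt)  # e.g. a thin wrapper around any chat-completion API
```

Unlike a retrieval pipeline, no document index has to be maintained; updating the domain model's weights refreshes the knowledge it can supply.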

Experiment and Results

In comprehensive experiments, the proposed model interaction paradigm outperformed traditional retrieval-based methods on both standard and newly introduced evaluation metrics. The authors also propose metrics tailored to long-form QA that align better with human evaluations. Notably, the method produces significantly more contextually accurate domain-specific answers. The researchers have released the source code and sample data to foster further research on empowering LLMs within specific industrial domains.
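
For reference, standard overlap metrics such as ROUGE and BERTScore (both cited in the paper) can be computed with the Hugging Face evaluate library, as in the hedged sketch below; the paper's own human-aligned metrics are not reproduced here, and the example strings are invented.

```python
# Scoring long-form answers with standard metrics via the `evaluate` library
# (requires the rouge_score and bert_score packages). Example strings are made up.
import evaluate

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

predictions = ["Clear the cached credentials, then restart the sync service and retry."]
references = ["Restart the sync service after clearing cached credentials, then retry the sign-in."]

print(rouge.compute(predictions=predictions, references=references))
print(bertscore.compute(predictions=predictions, references=references, lang="en"))
```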

Authors (9)
  1. Fangkai Yang (45 papers)
  2. Pu Zhao (82 papers)
  3. Zezhong Wang (30 papers)
  4. Lu Wang (329 papers)
  5. Jue Zhang (43 papers)
  6. Mohit Garg (15 papers)
  7. Qingwei Lin (81 papers)
  8. Saravan Rajmohan (85 papers)
  9. Dongmei Zhang (193 papers)
Citations (39)