
Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs (2312.05934v3)

Published 10 Dec 2023 in cs.AI, cs.CL, and cs.LG

Abstract: LLMs encapsulate a vast amount of factual information within their pre-trained weights, as evidenced by their ability to answer diverse questions across different domains. However, this knowledge is inherently limited, relying heavily on the characteristics of the training data. Consequently, using external datasets to incorporate new information or refine the capabilities of LLMs on previously seen information poses a significant challenge. In this study, we compare two common approaches: unsupervised fine-tuning and retrieval-augmented generation (RAG). We evaluate both approaches on a variety of knowledge-intensive tasks across different topics. Our findings reveal that while unsupervised fine-tuning offers some improvement, RAG consistently outperforms it, both for existing knowledge encountered during training and entirely new knowledge. Moreover, we find that LLMs struggle to learn new factual information through unsupervised fine-tuning, and that exposing them to numerous variations of the same fact during training could alleviate this problem.

This paper investigates two prevalent methods for knowledge injection into LLMs: unsupervised fine-tuning and retrieval-augmented generation (RAG). Given the static and non-specific nature of an LLM's knowledge base derived from pre-training, the paper examines how these methods enhance domain-specific knowledge and update the factual information base of the models.

Purpose of Knowledge Injection

Knowledge injection is critical to improving the domain-specific expertise and factual accuracy of LLMs. The authors underscore the importance of distinguishing between previously encountered knowledge and entirely new facts. The former refers to facts within the model's training data, while the latter involves new information that the model has not been exposed to during training.

Methodologies Compared

  1. Unsupervised Fine-Tuning:
    • This method continues training a pre-trained model on additional domain-specific text without labeled examples. While it can improve performance on the target domain, it shows limited efficacy in learning new information.
    • One problem identified is that LLMs have difficulty assimilating new factual knowledge through unsupervised fine-tuning unless they encounter numerous variations of the same fact during training.
  2. Retrieval-Augmented Generation (RAG):
    • RAG integrates retrieval mechanisms that allow an LLM to access external knowledge bases dynamically, thereby enhancing the model’s ability to incorporate new information that wasn't present in the original training data.
    • The RAG method consistently outperforms fine-tuning, particularly in injecting new facts into the model. It avoids issues like catastrophic forgetting, where the model loses previously learned information due to the adaptation process.
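The retrieval step described above can be sketched as follows. This is a minimal, self-contained illustration, not the paper's implementation: a bag-of-words overlap score stands in for the dense embedding model the authors actually used, and the corpus, query, and function names are hypothetical.

```python
from collections import Counter

def tokens(text: str) -> Counter:
    """Lowercase bag-of-words with trailing punctuation stripped."""
    return Counter(t.strip(".,?!") for t in text.lower().split())

def score(query: str, doc: str) -> float:
    """Crude lexical relevance: token overlap, lightly length-normalised.
    A real RAG system would use embedding cosine similarity instead."""
    q, d = tokens(query), tokens(doc)
    return sum((q & d).values()) / (1 + sum(d.values()))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the top-k corpus chunks by relevance score."""
    return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query: str, corpus: list[str], k: int = 2) -> str:
    """Prepend retrieved context to the question before calling the LLM."""
    context = "\n".join(retrieve(query, corpus, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "The Eiffel Tower is 330 metres tall.",
    "Photosynthesis converts light into chemical energy.",
    "The Eiffel Tower was completed in 1889.",
]
prompt = build_prompt("How tall is the Eiffel Tower?", corpus)
```

Because the external corpus is consulted at inference time rather than baked into the weights, updating the model's knowledge only requires updating the corpus, which is why RAG sidesteps catastrophic forgetting.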

Key Findings

  • Performance: RAG outperforms fine-tuning at improving an LLM’s knowledge, regardless of whether the information was previously encountered during training or entirely new.
  • Reliability: RAG is more robust in updating LLMs’ knowledge bases without degrading other model capabilities.
  • Challenges in Fine-Tuning: Unsupervised fine-tuning showed limited improvements and was less reliable for incorporating new factual information.
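The finding that fine-tuning benefits from many variations of the same fact suggests augmenting each fact with paraphrases before building the training corpus. A rough sketch is below; the paper used an LLM to generate paraphrases, whereas fixed templates stand in here so the example stays self-contained (the templates and function name are hypothetical).

```python
# Paraphrase-style augmentation: restate one fact in several surface forms
# so the model sees it repeatedly in different contexts during fine-tuning.
FACT_TEMPLATES = [
    "{subject} {relation} {object}.",
    "It is known that {subject} {relation} {object}.",
    "Recall that {subject} {relation} {object}.",
    "A relevant fact: {subject} {relation} {object}.",
]

def augment_fact(subject: str, relation: str, object_: str) -> list[str]:
    """Render one (subject, relation, object) fact as multiple text variants."""
    return [t.format(subject=subject, relation=relation, object=object_)
            for t in FACT_TEMPLATES]

training_corpus = augment_fact("The Amazon River", "flows into",
                               "the Atlantic Ocean")
```

Each variant carries the same information in a different surface form, which is the property the authors found helps unsupervised fine-tuning absorb new facts.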

Future Directions

The paper highlights potential areas for further investigation:

  • Optimization of RAG: Performance varied with the number of retrieved documents, suggesting a need for more efficient strategies for selecting relevant context.
  • Combined Techniques: The exploration of hybrid knowledge injection techniques, including supervised and reinforcement learning, could provide more comprehensive solutions.
  • Knowledge Representation: Further studies are needed to understand how LLMs internally represent knowledge, which could advance future improvements in knowledge injection methods.

By providing these insights, the paper makes significant contributions to understanding how to better inject knowledge into LLMs, thereby enhancing their functionality and adaptability for various domain-specific applications.

Authors (4)
  1. Oded Ovadia
  2. Menachem Brief
  3. Moshik Mishaeli
  4. Oren Elisha
Citations (86)