
Adaptive Contrastive Decoding in Retrieval-Augmented Generation for Handling Noisy Contexts (2408.01084v2)

Published 2 Aug 2024 in cs.CL

Abstract: When using LLMs in knowledge-intensive tasks, such as open-domain question answering, external context can bridge the gap between external knowledge and the LLMs' parametric knowledge. Recent research has been developed to amplify contextual knowledge over the parametric knowledge of LLMs with contrastive decoding approaches. While these approaches could yield truthful responses when relevant context is provided, they are prone to vulnerabilities when faced with noisy contexts. We extend the scope of previous studies to encompass noisy contexts and propose adaptive contrastive decoding (ACD) to leverage contextual influence effectively. ACD demonstrates improvements in open-domain question answering tasks compared to baselines, especially in robustness by remaining undistracted by noisy contexts in retrieval-augmented generation.

Adaptive Contrastive Decoding in Retrieval-Augmented Generation for Handling Noisy Contexts

Summary

The paper "Adaptive Contrastive Decoding in Retrieval-Augmented Generation for Handling Noisy Contexts" addresses the limitations of LLMs when dealing with knowledge-intensive tasks, such as open-domain question answering (QA), particularly in the presence of noisy contexts. This research introduces adaptive contrastive decoding (ACD) to improve the robustness of LLMs in retrieval-augmented generation (RAG) by effectively managing the influence of noisy contexts.

Background and Motivations

LLMs, such as those based on the architectures presented by Touvron et al. (2023) and Achiam et al. (2023), have shown significant advances across a multitude of benchmarks. Nonetheless, they struggle in knowledge-intensive tasks that require generalization beyond their parametric knowledge. One common approach to mitigating this limitation involves fine-tuning the models, which is computationally intensive and scales poorly with increasing model size.

To dynamically integrate current and accurate external knowledge without explicit re-training, researchers have explored strategies combining non-parametric knowledge with LLMs during generation. Specifically, contrastive decoding has been employed to enhance contextual influence. While these methods improve response accuracy given precise contexts, their efficacy significantly drops in the presence of noisy or irrelevant contexts. This demonstrates a need for a mechanism that can dynamically adjust the contextual influence during decoding.

Methodology

Problem Formulation

The research focuses on the open-domain QA task within the RAG framework. Given a question $q$ and a retrieved context $c$, a pretrained LLM generates a response based on its parametric knowledge. In this setup, the logit vectors for token predictions with and without context are $\mathbf{z}_{t}^{c} \in \mathbb{R}^{|V|}$ and $\mathbf{z}_{t} \in \mathbb{R}^{|V|}$, respectively.

Contrastive Decoding

ACD aims to mitigate the adverse effects of noisy contexts by dynamically adjusting the weight of contextual influence $\alpha$ based on the entropy of the model's predictions. The probability distribution for the next token $Y_{t}$ is modified to:

$$P_{\theta}(Y_{t} \mid x, y_{<t}) = \mathrm{softmax}\big(\mathbf{z}_{t} + \alpha (\mathbf{z}_{t}^{c} - \mathbf{z}_{t})\big)$$
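The logit interpolation above can be sketched in a few lines. This is a minimal illustration of the contrastive combination step only (function names are ours, not from the paper), assuming the two logit vectors for the current position are already available:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the vocabulary dimension.
    e = np.exp(z - z.max())
    return e / e.sum()

def contrastive_next_token_probs(z_t, z_t_ctx, alpha):
    """Shift the no-context logits z_t toward the with-context logits
    z_t_ctx by weight alpha, then normalize into a distribution."""
    return softmax(z_t + alpha * (z_t_ctx - z_t))
```

With `alpha = 0` this reduces to ordinary decoding from the no-context logits; with `alpha = 1` it decodes purely from the with-context logits; intermediate values blend the two.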

Adaptive Weight on Contextual Influence

The entropies $H(Y_{t})$ and $H(Y_{t}^{c})$, representing the model's uncertainty without and with context, respectively, are used to gauge context informativeness. The weight $\alpha_{\text{ACD}}$ is determined by the proportion of uncertainty reduction attributable to the context:

$$\alpha_{\text{ACD}} = \frac{H(Y_{t})}{H(Y_{t}) + H(Y_{t}^{c})}$$

This formulation ensures that context significantly reducing uncertainty receives higher weight, while noisy context receives lower influence.
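The entropy-based weighting can be sketched as follows. This is an illustrative implementation under our own naming, assuming the two next-token distributions (with and without context) have already been computed; a small `eps` guards against log-of-zero and division-by-zero:

```python
import numpy as np

def entropy(p, eps=1e-12):
    # Shannon entropy (in nats) of a probability vector.
    return float(-(p * np.log(p + eps)).sum())

def adaptive_alpha(p_no_ctx, p_ctx, eps=1e-12):
    """Weight from the ACD formula: high when the context makes the
    model much more certain, near 0.5 when it does not help."""
    h = entropy(p_no_ctx)
    h_c = entropy(p_ctx)
    return h / (h + h_c + eps)
```

For instance, if the model is near-uniform without context but sharply peaked with it, $H(Y_{t}^{c})$ is small and the weight approaches 1; if the context leaves the uncertainty unchanged, the weight stays near 0.5.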

Experimental Results

Experiments were conducted on three open-domain QA datasets: TriviaQA, Natural Questions (NQ), and PopQA. The paper compared ACD against baselines such as regular greedy decoding, CAD, and MICD. The results consistently showed that ACD outperforms these methods across all datasets and LLMs, particularly in handling noisy contexts, demonstrating its robustness.

For example, in the Known-noisy subset, where the model's parametric knowledge is correct but the retrieved context is noisy, ACD achieved higher EM scores of 76.72%, 88.79%, and 54.58% on NQ, TriviaQA, and PopQA, respectively, using the Llama2-7B model. In the complementary Unknown-gold subset, where the model lacks the correct answer but the context is accurate, ACD maintained competitive performance, underscoring its balanced approach to context integration.

Analysis

ACD's effectiveness stems from its adaptive weighting mechanism, as confirmed through analysis of the correlation between adaptive weights and context noisiness for different models. ACD exhibited a higher Area Under the ROC Curve (AUROC) compared to MICD across multiple datasets, suggesting better handling of unreliable context. Further case studies illustrated ACD adjusting context influence dynamically based on real-time uncertainty.

Implications and Future Work

ACD's advancements have practical implications for improving the reliability of LLMs in real-world applications where context quality varies. The methodology could be extended to instruction-following models or tasks requiring long-form generation, necessitating further research into handling partial relevance in contexts. Future developments could also explore integrating additional context verification mechanisms to bolster robustness further.

Conclusion

This research has introduced an innovative approach for managing noisy contexts in RAG, leveraging entropy-based adaptive contrastive decoding. The robust performance of ACD across various datasets and models highlights its potential in enhancing the reliability and accuracy of retrieval-augmented LLMs without the need for extensive re-training, paving the way for more dependable AI applications in knowledge-intensive domains.

Authors (9)
  1. Youna Kim
  2. Hyuhng Joon Kim
  3. Cheonbok Park
  4. Choonghyun Park
  5. Hyunsoo Cho
  6. Junyeob Kim
  7. Kang Min Yoo
  8. Sang-goo Lee
  9. Taeuk Kim