Learning to Retrieve In-Context Examples for Large Language Models (2307.07164v2)

Published 14 Jul 2023 in cs.CL and cs.IR

Abstract: LLMs have demonstrated their ability to learn in-context, allowing them to perform various tasks based on a few input-output examples. However, the effectiveness of in-context learning is heavily reliant on the quality of the selected examples. In this paper, we propose a novel framework to iteratively train dense retrievers that can identify high-quality in-context examples for LLMs. Our framework initially trains a reward model based on LLM feedback to evaluate the quality of candidate examples, followed by knowledge distillation to train a bi-encoder based dense retriever. Our experiments on a suite of 30 tasks demonstrate that our framework significantly enhances in-context learning performance. Furthermore, we show the generalization ability of our framework to unseen tasks during training. An in-depth analysis reveals that our model improves performance by retrieving examples with similar patterns, and the gains are consistent across LLMs of varying sizes. The code and data are available at https://github.com/microsoft/LMOps/tree/main/LLM_retriever.

Overview of "Learning to Retrieve In-Context Examples for LLMs"

The paper "Learning to Retrieve In-Context Examples for LLMs" presents a novel framework, LLM-R, designed to enhance the efficacy of in-context learning (ICL) in LLMs. The primary focus is on improving the selection of in-context examples, a pivotal factor influencing the model's performance on various tasks.

Framework and Methodology

The authors propose an iterative approach to training dense retrievers for identifying high-quality in-context examples. The framework is structured in several key stages:

  1. Initial Training: A reward model is first trained on feedback from the LLM. Its role is to score candidate examples, capturing fine-grained quality signals.
  2. Knowledge Distillation: A bi-encoder-based dense retriever is then trained via knowledge distillation from the reward model, using the reward model's soft labels in place of one-hot labels (see the sketch after this list).
  3. Iterative Improvement: New candidates are repeatedly retrieved with the updated retriever, so the quality of the selected examples improves over successive iterations.
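
A minimal PyTorch sketch of the distillation objective in stage 2 is given below. It assumes the bi-encoder's candidate scores and the reward model's scores have already been computed; the tensor shapes and the temperature value are illustrative choices, not the paper's settings.

```python
# Minimal sketch of the stage-2 distillation objective; shapes and temperature
# are illustrative assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(retriever_scores: torch.Tensor,
                      reward_scores: torch.Tensor,
                      temperature: float = 1.0) -> torch.Tensor:
    """KL divergence between the retriever's distribution over candidates and
    the soft labels produced by the reward model (rather than one-hot labels)."""
    student_log_probs = F.log_softmax(retriever_scores / temperature, dim=-1)
    teacher_probs = F.softmax(reward_scores / temperature, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

# Toy usage: a batch of 2 queries, each paired with 4 candidate examples.
retriever_scores = torch.randn(2, 4, requires_grad=True)  # bi-encoder similarities
reward_scores = torch.randn(2, 4)                         # reward-model outputs
loss = distillation_loss(retriever_scores, reward_scores)
loss.backward()
```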

Experimental Results

The framework's performance was tested on a diverse set of 30 NLP tasks. Notable results include an average improvement of 7.8% in in-context learning over random example selection, and the gains carry over to tasks unseen during training and to LLMs of different sizes. The retrieved examples are notably characterized by similar patterns and labels, which contributes to robust performance improvements across tasks.

Analysis and Insights

  • Generalization: LLM-R exhibits strong generalization, not only excelling on tasks seen during training but also maintaining performance on held-out tasks. This underscores the method's adaptability to different LLMs and task types.
  • Task Sensitivity: The experiments reveal that classification tasks with extensive training examples benefit most from the framework. Conversely, tasks that rely on the LLM's inherent capabilities, such as commonsense reasoning, are less sensitive to example quality.
  • Iterative Process: The iterative training methodology allows for gradual performance optimization, which stabilizes after a certain number of iterations, indicating convergence.

Implications and Future Directions

The findings illustrate the potential of tailored retrieval methods for enhancing the learning capabilities of LLMs. The framework's capacity for generalization suggests promising applications in domains with limited labeled data, bridging gaps where traditional methods might struggle.

Future research could explore incorporating combinatorial optimization techniques to address the sequential dependencies of in-context examples. Additionally, testing the framework across non-overlapping task categories would offer further validation of its robustness.

In conclusion, this paper contributes a significant advancement in the methodology of example retrieval for LLMs, highlighting the importance of strategic example selection in maximizing the potential of in-context learning paradigms.

Authors (3)
  1. Liang Wang (512 papers)
  2. Nan Yang (182 papers)
  3. Furu Wei (291 papers)
Citations (29)