
Internet-augmented language models through few-shot prompting for open-domain question answering (2203.05115v2)

Published 10 Mar 2022 in cs.CL and cs.LG

Abstract: In this work, we aim to capitalize on the unique few-shot capabilities of large-scale language models (LSLMs) to overcome some of their challenges with respect to grounding to factual and up-to-date information. Motivated by semi-parametric language models (LMs), which ground their decisions in external retrieved evidence, we use few-shot prompting to learn to condition LMs on information returned from the web using Google Search, a broad and constantly updated knowledge source. Our approach does not involve fine-tuning or learning additional parameters, thus making it applicable to any LM, offering therefore a strong baseline. Indeed, we find that LMs conditioned on the web surpass performance of closed-book models of similar, or even larger, model sizes in open-domain question answering. Finally, we find that increasing the inference-time compute of models, achieved via using multiple retrieved evidences to generate multiple answers followed by a reranking stage that uses scores generated by the same LMs, leads to better performance and alleviates lower performance of smaller few-shot LMs. All in all, our findings suggest that it might be beneficial to slow down the race towards the biggest model and instead shift attention towards finding more effective ways to use models, including but not limited to, better prompting or increasing inference-time compute.

Internet-Augmented Language Models Through Few-Shot Prompting for Open-Domain Question Answering: An Evaluation

The paper "Internet-augmented LLMs through few-shot prompting for open-domain question answering" explores the efficacy of integrating large-scale LLMs (LSLMs) with external, web-based sources to enhance their performance in open-domain question answering tasks. The proposed methodology leverages few-shot prompting, a technique driven by the inherent capabilities of LSLMs, to condition these models on information sourced directly from the Internet using Google Search. This approach circumvents the need for fine-tuning or parameter adjustments, thereby establishing a robust baseline method applicable to a variety of LLMs.

Methodology and Findings

The authors improve the factual grounding and recency of LSLMs by treating a search engine as a dynamic, continually updated repository of information. The strategy is rooted in the framework of semi-parametric models, which condition predictions on retrieved evidence to mitigate issues such as hallucination. Concretely, each question is issued verbatim as a search query; the returned web documents are split into passages, and the most relevant passages are used as evidence in few-shot prompts for the LM, as sketched below.
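To make the pipeline concrete, here is a minimal sketch of the retrieve-then-prompt loop. The `search_web` function is a hypothetical stand-in for a Google Search API client (the paper uses Google Search, but this code interface is assumed), the TF-IDF passage selection mirrors the paper's evidence ranking, and the `Evidence / Question / Answer` template approximates its prompt format:

```python
# Minimal sketch of the retrieve-then-prompt pipeline (assumptions noted above).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def search_web(query: str) -> list[str]:
    """Hypothetical wrapper around a search API; returns raw page texts."""
    raise NotImplementedError("plug in a real search client here")

def top_paragraph(question: str, pages: list[str]) -> str:
    # Split pages into paragraphs and keep the one most similar to the
    # question under TF-IDF, mirroring the paper's evidence selection.
    paragraphs = [p for page in pages for p in page.split("\n") if p.strip()]
    vec = TfidfVectorizer().fit(paragraphs + [question])
    sims = cosine_similarity(vec.transform([question]), vec.transform(paragraphs))
    return paragraphs[sims.argmax()]

def build_prompt(examples: list[tuple[str, str, str]],
                 evidence: str, question: str) -> str:
    # k-shot prompt: each in-context example is (evidence, question, answer).
    shots = "\n\n".join(
        f"Evidence: {e}\nQuestion: {q}\nAnswer: {a}" for e, q, a in examples
    )
    return f"{shots}\n\nEvidence: {evidence}\nQuestion: {question}\nAnswer:"

def answer(question, examples, generate):
    # Single-evidence case: condition the LM on the best-matching passage.
    evidence = top_paragraph(question, search_web(question))
    return generate(build_prompt(examples, evidence, question))
```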

Empirical results underscore the superiority of internet-conditioned models over traditional closed-book models, particularly in open-domain question answering. For instance, the proposed approach yielded a relative performance gain of 15%-30% over conventional few-shot LSLMs on generation tasks, highlighting the tangible benefits of conditioning on external evidence. The paper also shows that even multi-hop questions, which are more prone to retrieval errors, see performance improvements, albeit smaller ones.

Additionally, the research examines the effect of increasing inference-time compute by reranking multiple answers generated from multiple retrieved documents. The technique samples several candidate answers, one per evidence passage, and then reranks them using scores produced by the same LM, effectively narrowing the performance gap between smaller few-shot models and their larger counterparts. The emphasis thus shifts from merely expanding model size to optimizing how models are used, through better prompting and increased inference-time compute; a sketch of the reranking loop follows.
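Below is a hedged sketch of that reranking step, reusing `build_prompt` from the earlier snippet. The `generate` and `logprob` callables are assumed LM interfaces rather than any specific API, and the scoring rule shown, the log-probability the same LM assigns to a candidate answer given its evidence-augmented prompt, is one of several scoring functions the paper considers:

```python
# Hedged sketch of inference-time reranking: one candidate answer per
# evidence passage, keeping the candidate the LM itself scores highest.
from dataclasses import dataclass

@dataclass
class Candidate:
    evidence: str
    answer: str
    score: float

def rerank_answers(question, paragraphs, examples, generate, logprob):
    candidates = []
    for evidence in paragraphs:
        prompt = build_prompt(examples, evidence, question)
        ans = generate(prompt)
        # Score the sampled answer with the same LM: the log-probability
        # it assigns to `ans` conditioned on the evidence-augmented prompt.
        candidates.append(Candidate(evidence, ans, logprob(prompt, ans)))
    return max(candidates, key=lambda c: c.score)
```

Spending more compute this way, more evidence passages and more sampled answers, is what lets smaller few-shot models close part of the gap to larger ones.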

Theoretical and Practical Implications

The theoretical implications underline the potential to decelerate the trend towards ever-larger models by concentrating effort on how existing models exploit their few-shot capabilities. Moreover, the research illustrates the prospect of grounding LMs in search engines, which widens the breadth of topics and viewpoints models can access but also introduces challenges associated with uncurated and heterogeneous online content.

Practically, this approach sets a precedent for using internet-sourced information in applications requiring factual and current data, as showcased in the SituatedQA experiments assessing models' adaptability to post-training world events. Though challenges such as misinformation and heterogeneous web content remain, the proposed method offers a lightweight, scalable alternative to compute-intensive retraining paradigms.

Future Direction

Future work could refine retrieval mechanisms for complex queries or better align parametric and non-parametric knowledge sources. Advanced methods for contextual and prompt optimization could further improve both generalization and faithful grounding in evolving knowledge landscapes. A pivotal aspect of future deployments will be balancing reliance on external retrieval components, such as commercial search engines, against resilience to misinformation, underscoring an ongoing need for innovation in AI safety and interpretability.

In conclusion, the paper's insights chart a course beyond inching towards ever-larger models, advocating instead for strategic integration of existing models with real-world, dynamic data sources. By advancing internet-conditioning practices, it lays the groundwork for LSLMs to engage more effectively with the modern, decentralized informational ecosystem.

Authors (4)
  1. Angeliki Lazaridou (34 papers)
  2. Elena Gribovskaya (9 papers)
  3. Wojciech Stokowiec (11 papers)
  4. Nikolai Grigorev (2 papers)
Citations (115)