
Evaluating the Effectiveness of Retrieval-Augmented Large Language Models in Scientific Document Reasoning (2311.04348v1)

Published 7 Nov 2023 in cs.CL and cs.AI

Abstract: Despite the dramatic progress in LLM development, LLMs often provide seemingly plausible but non-factual information, commonly referred to as hallucinations. Retrieval-augmented LLMs offer a non-parametric approach to mitigating this issue by retrieving relevant information from external data sources and augmenting the training process. Because these models trace evidence back to an externally provided knowledge base, their predictions can be better interpreted and verified. In this work, we critically evaluate the ability of these models to perform scientific document reasoning tasks. To this end, we tuned multiple such model variants with science-focused instructions and evaluated them on a scientific document reasoning benchmark, assessing the usefulness of the retrieved document passages. Our findings suggest that these models justify their predictions on science tasks with fabricated evidence, and that leveraging a scientific corpus as pretraining data does not alleviate the risk of evidence fabrication.

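To make the retrieve-then-read setup the abstract refers to concrete, here is a minimal illustrative sketch of retrieval-augmented prompting. It is not the authors' pipeline: the `embed`, `retrieve`, and `build_prompt` functions, the toy corpus, and the hashed bag-of-words encoder are all assumptions chosen so the example runs without external dependencies; a real system would use a trained dense retriever and pass the prompt to an LLM.

```python
import numpy as np

# Illustrative sketch only, not the paper's method.
# embed() stands in for a real sentence encoder (hypothetical); a hashed
# bag-of-words vector is used so the example runs with numpy alone.

def embed(text: str, dim: int = 256) -> np.ndarray:
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Rank corpus passages by cosine similarity to the query embedding.
    q = embed(query)
    scores = [float(q @ embed(p)) for p in corpus]
    top = np.argsort(scores)[::-1][:k]
    return [corpus[i] for i in top]

def build_prompt(query: str, passages: list[str]) -> str:
    # Retrieved passages are prepended as explicit, numbered evidence so an
    # evaluator can check whether the model's answer is grounded in them.
    evidence = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Evidence:\n{evidence}\n\nQuestion: {query}\nAnswer citing the evidence:"

if __name__ == "__main__":
    corpus = [
        "Perovskite solar cells degrade rapidly under humidity.",
        "Transformer models scale with data and parameters.",
        "Encapsulation layers improve perovskite cell stability.",
    ]
    question = "How can perovskite solar cell stability be improved?"
    prompt = build_prompt(question, retrieve(question, corpus))
    print(prompt)  # This prompt would then be passed to an LLM for generation.
```

The evaluation question the paper raises is whether answers produced from such prompts actually cite the retrieved passages or instead fabricate supporting evidence.
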
Authors (4)
  1. Sai Munikoti (25 papers)
  2. Anurag Acharya (13 papers)
  3. Sridevi Wagle (7 papers)
  4. Sameera Horawalavithana (18 papers)
Citations (4)
