
Zero-shot Generative Large Language Models for Systematic Review Screening Automation (2401.06320v2)

Published 12 Jan 2024 in cs.IR and cs.CL

Abstract: Systematic reviews are crucial for evidence-based medicine as they comprehensively analyse published research findings on specific questions. Conducting such reviews is often resource- and time-intensive, especially in the screening phase, where abstracts of publications are assessed for inclusion in a review. This study investigates the effectiveness of using zero-shot large language models (LLMs) for automatic screening. We evaluate the effectiveness of eight different LLMs and investigate a calibration technique that uses a predefined recall threshold to determine whether a publication should be included in a systematic review. Our comprehensive evaluation using five standard test collections shows that instruction fine-tuning plays an important role in screening, that calibration renders LLMs practical for achieving a targeted recall, and that combining both with an ensemble of zero-shot models saves significant screening time compared to state-of-the-art approaches.
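
The calibration idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each candidate publication receives a numeric inclusion score from a model and that a small labelled calibration set is available; the function names `calibrate_threshold` and `screen` and the example scores are hypothetical, and the abstract does not confirm that the authors' procedure works exactly this way.

```python
import math
from typing import List, Sequence


def calibrate_threshold(
    scores: Sequence[float], labels: Sequence[int], target_recall: float
) -> float:
    """Pick the highest score cutoff whose recall on a labelled
    calibration set is at least target_recall (hypothetical helper)."""
    # Scores of the publications that should be included, highest first.
    positives = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    if not positives:
        raise ValueError("calibration set contains no included publications")
    # Minimum number of included publications that must pass the cutoff.
    needed = max(1, math.ceil(target_recall * len(positives)))
    return positives[needed - 1]


def screen(scores: Sequence[float], threshold: float) -> List[bool]:
    """Mark a publication as included when its score reaches the cutoff."""
    return [s >= threshold for s in scores]


# Illustrative scores and labels only, not data from the paper.
calibration_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
calibration_labels = [1, 0, 1, 1, 0]
threshold = calibrate_threshold(calibration_scores, calibration_labels, 0.95)
decisions = screen(calibration_scores, threshold)
```

A lower cutoff includes more publications, which raises recall at the cost of more manual screening; the sketch simply picks the strictest cutoff that still meets the target on the calibration set.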

Authors (6)
  1. Shuai Wang (466 papers)
  2. Harrisen Scells (22 papers)
  3. Shengyao Zhuang (42 papers)
  4. Martin Potthast (64 papers)
  5. Bevan Koopman (37 papers)
  6. Guido Zuccon (73 papers)
Citations (7)