
Multi-Task Training with In-Domain Language Models for Diagnostic Reasoning (2306.04551v2)

Published 7 Jun 2023 in cs.CL and cs.LG

Abstract: Generative AI is a promising direction for augmenting clinical diagnostic decision support and reducing diagnostic errors, a leading contributor to medical errors. To further the development of clinical AI systems, the Diagnostic Reasoning Benchmark (DR.BENCH) was introduced as a comprehensive generative AI framework comprising six tasks that represent key components of clinical reasoning. We present a comparative analysis of in-domain versus out-of-domain LLMs, as well as multi-task versus single-task training, with a focus on the problem summarization task in DR.BENCH (Gao et al., 2023). We demonstrate that a multi-task, clinically trained LLM outperforms its general-domain counterpart by a large margin, establishing a new state of the art with a ROUGE-L score of 28.55. This research underscores the value of domain-specific training for optimizing clinical diagnostic reasoning tasks.
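For readers unfamiliar with the metric cited in the abstract, ROUGE-L scores a generated summary by the longest common subsequence (LCS) of tokens it shares with a reference. The sketch below is a minimal illustration only: it assumes lowercased whitespace tokenization and the balanced F1 variant, and the example strings are hypothetical. The abstract does not state the exact ROUGE implementation or settings the authors used, and reported scores such as 28.55 are conventionally this value multiplied by 100.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Sentence-level ROUGE-L F1 in [0, 1].

    Assumption: lowercased whitespace tokens, no stemming.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


# Hypothetical problem-summary strings, not taken from DR.BENCH.
score = rouge_l_f1("acute kidney injury and sepsis",
                   "sepsis with acute kidney injury")
print(round(100 * score, 2))
```

Here the shared subsequence is "acute kidney injury" (3 of 5 tokens on each side), so precision and recall are both 0.6. Because the LCS respects word order, the out-of-place "sepsis" earns no credit.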

Authors (6)
  1. Brihat Sharma
  2. Yanjun Gao
  3. Timothy Miller
  4. Matthew M. Churpek
  5. Majid Afshar
  6. Dmitriy Dligach
Citations (6)