Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 83 tok/s
Gemini 2.5 Pro 54 tok/s Pro
GPT-5 Medium 21 tok/s Pro
GPT-5 High 20 tok/s Pro
GPT-4o 103 tok/s Pro
Kimi K2 205 tok/s Pro
GPT OSS 120B 456 tok/s Pro
Claude Sonnet 4 35 tok/s Pro
2000 character limit reached

Reverse Physician-AI Relationship: Full-process Clinical Diagnosis Driven by a Large Language Model (2508.10492v1)

Published 14 Aug 2025 in cs.AI, cs.CE, and cs.CL

Abstract: Full-process clinical diagnosis in the real world encompasses the entire diagnostic workflow that begins with only an ambiguous chief complaint. While AI, particularly LLMs, is transforming clinical diagnosis, its role remains largely as an assistant to physicians. This AI-assisted working pattern makes AI can only answer specific medical questions at certain parts within the diagnostic process, but lack the ability to drive the entire diagnostic process starting from an ambiguous complaint, which still relies heavily on human physicians. This gap limits AI's ability to fully reduce physicians' workload and enhance diagnostic efficiency. To address this, we propose a paradigm shift that reverses the relationship between physicians and AI: repositioning AI as the primary director, with physicians serving as its assistants. So we present DxDirector-7B, an LLM endowed with advanced deep thinking capabilities, enabling it to drive the full-process diagnosis with minimal physician involvement. Furthermore, DxDirector-7B establishes a robust accountability framework for misdiagnoses, delineating responsibility between AI and human physicians. In evaluations across rare, complex, and real-world cases under full-process diagnosis setting, DxDirector-7B not only achieves significant superior diagnostic accuracy but also substantially reduces physician workload than state-of-the-art medical LLMs as well as general-purpose LLMs. Fine-grained analyses across multiple clinical departments and tasks validate its efficacy, with expert evaluations indicating its potential to serve as a viable substitute for medical specialists. These findings mark a new era where AI, traditionally a physicians' assistant, now drives the entire diagnostic process to drastically reduce physicians' workload, indicating an efficient and accurate diagnostic solution.

Summary

  • The paper introduces DxDirector-7B as a novel LLM that directs the diagnostic process with minimal physician input.
  • It employs a three-stage training methodology including continued pre-training, instruction-tuning, and step-level strategy optimization for refined clinical reasoning.
  • Experimental results show superior diagnostic accuracy and parameter efficiency over existing medically-adapted and commercial LLMs.

Reverse Physician-AI Relationship: Full-process Clinical Diagnosis Driven by a LLM

The paper "Reverse Physician-AI Relationship: Full-process Clinical Diagnosis Driven by a LLM" (2508.10492) presents a novel approach to clinical diagnostics using a LLM, DxDirector-7B. This model aims to shift the conventional roles in clinical diagnostics, promoting the AI to the role of director while relegating human physicians to assistants. This marks a fundamental change from the existing paradigm where AI merely serves as an adjunct to human decision-making. Through advanced "slow thinking" capabilities, DxDirector-7B is designed to guide the entire diagnostic workflow effectively, beginning from an ambiguous patient complaint to a determined diagnosis, thus minimizing physician involvement and aiming to significantly reduce their workload. Figure 1

Figure 1: Workflow of full-process diagnosis in past, now and the new era. The inner circle represents the directorship (deciding the specific clinical problems) of multi-step dynamic clinical diagnosis, and the outer circle represents the specific execution of the corresponding steps (solving the clinical problems).

Methodology and System Architecture

The paper introduces DxDirector-7B, a LLM constructed with 7 billion parameters, specifically optimized for medical diagnostics. The authors describe their training method for the model, incorporating three stages: continued pre-training on medical data, instruction-tuning for full-process diagnosis starting from vague chief complaints, and step-level strategy preference optimization.

Stage 1: Continued Pre-training

DxDirector-7B is initialized from Llama-2-7B and subjected to an extensive continued pre-training on a corpus comprising medical literature, including clinical guidelines, PubMed abstracts, and full papers. The model is trained to acquire essential medical knowledge through a cross-entropy loss function, based on the next token prediction framework, which enables the model to learn from the text data itself.

Stage 2: Instruction-Tuning

The instruction-tuning phase enables DxDirector-7B to emulate human "slow thinking" and execute the full-process clinical diagnosis. The training dataset was constructed using MedQA, comprising 10,178 high-quality instruction-response pairs. Researchers utilize GPT-4o and o1-preview for data transformation and deep thinking injection. Specific prompts guided model training, resulting in a comprehensive dataset representing stepwise reasoning with clear delineation of LLM and human physician roles. Each step involves a structured process of formulating diagnostic queries and generating corresponding responses, thereby training DxDirector-7B to perform deep, nuanced clinical reasoning and proactive physician interaction when necessary. Figure 2

Figure 2: Comparison between our DxDirector-7B and existing LLMs in full-process clinic diagnosis.

Stage 3: Step-Level Strategy Preference Optimization

The authors introduce Step-Level Strategy Preference Optimization—a reinforcement learning technique to ensure that at each step of diagnosis, DxDirector-7B selects optimal strategies. They construct datasets comprising multiple reasoning paths, varying in rewards based on final diagnosis accuracy and physician involvement. The model learns to prefer higher-reward strategies, leveraging the Direct Preference Optimization objective. This approach ensures that the model not only attains superior diagnostic accuracy with minimal assistance but also establishes a robust accountability mechanism. Figure 3

Figure 3: A case of DxDirector-7B performing the full-process diagnosis starting with only a chief complaint.

Experimental Evaluation

The authors systematically evaluate DxDirector-7B's performance against both medically adapted and powerful commercial LLMs across a range of datasets including NEJM Clinicopathologic Cases, RareArena, and ClinicalBench. The results consistently show that DxDirector-7B outperforms competitors in diagnostic accuracy despite having significantly fewer parameters, demonstrating marked parameter efficiency. In Fig. 4, DxDirector-7B achieves superior diagnostic accuracy across rare disease and real-world datasets, significantly reducing the necessary physician involvement relative to baseline LLMs. Figure 4

Figure 4

Figure 4

Figure 4: Rare Disease Cases (RareArena).

Figure 5

Figure 5

Figure 5

Figure 5: A comparative heatmap analysis of diagnostic accuracy: DxDirector-7B vs. state-of-the-art medically adapted and commercial general-purpose LLMs across 17 clinical departments consisting of 1,500 samples in ClinicalBench.

Implications and Future Work

DxDirector-7B represents a significant evolution in the intersection of AI and clinical medicine, demonstrating that LLMs can assume a directing role in diagnosis with minimal human intervention. The model's capacity to autonomously guide diagnosis from a vague complaint significantly reduces physician workload, offering scalable diagnostic solutions particularly beneficial in resource-constrained settings. Despite its successes, areas for further exploration exist, such as refining physician involvement rules and integrating specialized AI models for pathology to fully realize DxDirector-7B's potential.

Conclusion

Through rigorous experimentation and methodological innovation, DxDirector-7B challenges and transforms existing paradigms in medical diagnostics. By repositioning AI not as an assistant but as the primary director, this work paves the way for more efficient, accurate, and scalable diagnostic methodologies. Further research and refinement are expected to augment DxDirector-7B's capabilities, enhancing its applicability across diverse clinical contexts worldwide.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.

Don't miss out on important new AI/ML research

See which papers are being discussed right now on X, Reddit, and more:

“Emergent Mind helps me see which AI papers have caught fire online.”

Philip

Philip

Creator, AI Explained on YouTube