- The paper presents a framework that extracts executable clinical guidance trees from over 5000 medical documents, forming the MedDM dataset.
- The paper employs image recognition and NLP techniques to convert clinical flowcharts into LLM-interpretable decision nodes.
- The paper demonstrates that integrating its CGT structure with LLMs improves personalized medical advice through iterative decision-making dialogues.
The paper presents a framework designed to address two challenges in using LLMs for medical diagnosis. First, current medical LLMs show low specialization: they behave like medical Q&A systems that ask about symptoms and give general responses, rather than offering specific medical advice based on comprehensive analysis. Second, there is a lack of suitable datasets for building executable clinical guidance trees (CGTs), which could strengthen the decision-making capabilities of LLMs in clinical settings.
To tackle these issues, the researchers propose a method to extract CGTs from published medical literature and pilot an approach to create an LLM-friendly decision-making dataset known as MedDM. The CGTs are constructed from flowcharts found in clinical practice guidelines, embodying the decision-making processes used by physicians in disease diagnostics or treatment pathways.
The MedDM dataset was created by collecting and analyzing over 5000 pieces of medical literature, from which 1202 flowchart images were identified and refined into structured decision trees. These decision trees cover 12 hospital departments and include knowledge on over 500 diseases. The flowcharts are processed with image recognition and natural language processing techniques so that the nodes and the relationships between flowchart components are represented accurately.
A significant contribution of this work is the creation of an LLM-executable CGT structure, which is specifically designed to be interpretable by LLMs. The nodes of this tree are represented using natural language, making it easier to convert into prompts that an LLM can use to generate responses. To ensure the models can use the CGTs effectively, the authors also devise a CGT inference engine for iterative decision-making. This engine is capable of interacting with patients, asking questions, and making decisions based on the information collected, much like a medical professional would in a consultation.
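To make the idea of an LLM-executable tree and an iterative inference engine concrete, here is a minimal Python sketch. It is not the authors' implementation: the names `CGTNode`, `node_to_prompt`, and `run_cgt` are hypothetical, the tree content is a toy placeholder rather than clinical guidance, and the `decide` callable stands in for whatever LLM or patient dialogue would supply the answer at each decision point.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CGTNode:
    """One node of a hypothetical clinical guidance tree (illustrative only)."""
    text: str  # natural-language question (inner node) or advice (leaf)
    children: Dict[str, "CGTNode"] = field(default_factory=dict)  # answer -> next node

    @property
    def is_leaf(self) -> bool:
        return not self.children


def node_to_prompt(node: CGTNode, history: List[str]) -> str:
    """Render the current decision node as a prompt an LLM could act on."""
    options = ", ".join(node.children)
    past = "\n".join(history) or "(none)"
    return (
        f"Known patient information:\n{past}\n"
        f"Current decision point: {node.text}\n"
        f"Choose exactly one of: {options}"
    )


def run_cgt(root: CGTNode, decide: Callable[[str], str], max_steps: int = 20) -> str:
    """Walk the tree, letting `decide` pick one branch at each decision node."""
    node = root
    history: List[str] = []
    for _ in range(max_steps):
        if node.is_leaf:
            return node.text
        choice = decide(node_to_prompt(node, history)).strip().lower()
        if choice not in node.children:
            raise ValueError(f"unexpected answer {choice!r} at node {node.text!r}")
        history.append(f"{node.text} -> {choice}")
        node = node.children[choice]
    raise RuntimeError("step limit reached before a leaf")


# Toy tree with placeholder content; assumed for illustration, not medical advice.
toy_tree = CGTNode(
    "Has the cough lasted more than three weeks?",
    {
        "yes": CGTNode("Is there fever?", {
            "yes": CGTNode("Toy advice A: refer for further examination."),
            "no": CGTNode("Toy advice B: schedule a routine follow-up."),
        }),
        "no": CGTNode("Toy advice C: monitor symptoms."),
    },
)

# Scripted answers stand in for an LLM that reads each prompt and replies.
scripted = iter(["Yes", "no"])
advice = run_cgt(toy_tree, lambda prompt: next(scripted))
print(advice)
```

In this sketch each branch label is matched exactly after lowercasing, so a real system would need a more tolerant way of mapping free-form LLM or patient replies onto branches. Keeping node text in natural language is what lets `node_to_prompt` build a prompt by plain string formatting.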
The authors report that integrating their CGT structure with LLMs yields models that give more precise and personalized medical advice during multi-turn dialogues with patients. Augmented with this structured clinical decision-making knowledge, the models can ask relevant follow-up questions to gather additional patient information, leading to more accurate disease identification and treatment recommendations.
Lastly, while the research team has made significant strides with MedDM and the new CGT framework, they also indicate possible limitations and future directions. These include updating and expanding the medical literature pool to keep abreast of advancements in medical knowledge and refining the framework to cover a broader range of medical conditions.