
LingYi: Medical Conversational Question Answering System based on Multi-modal Knowledge Graphs (2204.09220v1)

Published 20 Apr 2022 in cs.CL and cs.AI

Abstract: The medical conversational system can relieve the burden of doctors and improve the efficiency of healthcare, especially during the pandemic. This paper presents a medical conversational question answering (CQA) system based on the multi-modal knowledge graph, namely "LingYi", which is designed as a pipeline framework to maintain high flexibility. Our system utilizes automated medical procedures including medical triage, consultation, image-text drug recommendation and record. To conduct knowledge-grounded dialogues with patients, we first construct a Chinese Medical Multi-Modal Knowledge Graph (CM3KG) and collect a large-scale Chinese Medical CQA (CMCQA) dataset. Compared with the other existing medical question-answering systems, our system adopts several state-of-the-art technologies including medical entity disambiguation and medical dialogue generation, which is more friendly to provide medical services to patients. In addition, we have open-sourced our codes which contain back-end models and front-end web pages at https://github.com/WENGSYX/LingYi. The datasets including CM3KG at https://github.com/WENGSYX/CM3KG and CMCQA at https://github.com/WENGSYX/CMCQA are also released to further promote future research.

Summary

  • The paper introduces LingYi, a novel system using a Chinese multi-modal knowledge graph that achieved 90.9% accuracy in entity recognition during triage.
  • LingYi employs a dynamic symptom selection algorithm and Central Records Memory to optimize diagnostic questioning and consultation efficiency.
  • The system leverages pre-trained models and prompt learning to generate contextually accurate dialogue responses and image-text drug recommendations.

Overview of "LingYi: Medical Conversational Question Answering System based on Multi-modal Knowledge Graphs"

The paper, "LingYi: Medical Conversational Question Answering System based on Multi-modal Knowledge Graphs," presents a novel approach to medical conversational systems using a flexible, pipelined framework integrating multi-modal knowledge graphs. This system, referred to as "LingYi," is proposed as a solution to enhance efficiency in healthcare delivery by supporting automated medical procedures such as triage, consultation, and drug recommendation.
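
As a rough illustration of what a pipelined design like this can look like, the sketch below chains four stages over a shared session object. Everything in it is an assumption made for illustration: the `Session` class, the stage functions, and the keyword rule in `triage` are hypothetical placeholders and are not taken from the LingYi code base.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Session:
    """Shared state handed from stage to stage (hypothetical structure)."""
    utterance: str
    notes: Dict[str, str] = field(default_factory=dict)


# A stage is any function that takes a session and returns it, possibly updated.
Stage = Callable[[Session], Session]


def triage(session: Session) -> Session:
    # Placeholder keyword rule; a real system would use a trained model.
    has_cough = "cough" in session.utterance
    session.notes["department"] = "respiratory" if has_cough else "general"
    return session


def consultation(session: Session) -> Session:
    # Placeholder follow-up question.
    session.notes["follow_up"] = "How long have you had these symptoms?"
    return session


def recommendation(session: Session) -> Session:
    # Placeholder for a knowledge-graph lookup.
    session.notes["drug"] = "(looked up in the knowledge graph)"
    return session


def record(session: Session) -> Session:
    # Summarize the session for later visits.
    session.notes["record"] = f"{session.notes['department']}: {session.utterance}"
    return session


def run_pipeline(session: Session, stages: List[Stage]) -> Session:
    """Apply each stage in order; stages can be swapped or removed freely."""
    for stage in stages:
        session = stage(session)
    return session


PIPELINE: List[Stage] = [triage, consultation, recommendation, record]
result = run_pipeline(Session("I have a cough and a sore throat"), PIPELINE)
print(result.notes["department"])
```

Because each stage only depends on the shared session, one module can be replaced (for example, a different triage model) without touching the others, which is the kind of flexibility a pipeline framework is meant to provide.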

System Highlights

LingYi is built around a Chinese Medical Multi-Modal Knowledge Graph (CM3KG) and a large-scale Chinese Medical Conversational Question Answering (CMCQA) dataset. The framework covers three phases (before, during, and after diagnosis), each with its own modules:

  1. Before Diagnosis: This phase covers medical triage and the recognition and disambiguation of medical entities. The paper applies recent methods for entity recognition and disambiguation and reports an accuracy of 90.9% on standard datasets.
  2. Consultation Stage: The system utilizes a dynamic symptom selection algorithm designed to optimize diagnostic questioning, minimizing redundancy while ensuring comprehensive symptom capture. This stage leverages Central Records Memory (CRM) to store, update, and reason with patient data, enabling informed dialogue generation.
  3. Post-Diagnosis: The system provides image-text drug recommendations, leveraging the multi-modal aspect of CM3KG to aid patient comprehension and decision-making. Additionally, it generates medical records summarizing the conversation and consultation for subsequent patient visits.
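
To make the consultation stage more concrete, here is a minimal sketch of one way to pick the next symptom to ask about while keeping a running record of the patient's answers. It is not the algorithm from the paper: the selection rule (ask about the symptom that splits the remaining candidate diseases most evenly), the `RecordsMemory` class standing in for the Central Records Memory, and the toy disease data are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Toy disease -> symptom map standing in for knowledge-graph triples (made-up data).
DISEASE_SYMPTOMS: Dict[str, Set[str]] = {
    "common cold": {"cough", "runny nose", "sore throat"},
    "influenza": {"cough", "fever", "muscle ache"},
    "allergic rhinitis": {"runny nose", "sneezing", "itchy eyes"},
    "pneumonia": {"cough", "fever", "chest pain"},
}


@dataclass
class RecordsMemory:
    """Hypothetical stand-in for a records memory: stores the answers so far."""
    present: Set[str] = field(default_factory=set)
    absent: Set[str] = field(default_factory=set)

    def update(self, symptom: str, has_it: bool) -> None:
        (self.present if has_it else self.absent).add(symptom)

    def candidates(self) -> List[str]:
        # Keep diseases that have every confirmed symptom and no denied one.
        return [
            disease
            for disease, symptoms in DISEASE_SYMPTOMS.items()
            if self.present <= symptoms and not (self.absent & symptoms)
        ]


def next_symptom(memory: RecordsMemory) -> Optional[str]:
    """Return the unasked symptom that splits the candidates most evenly."""
    cands = memory.candidates()
    asked = memory.present | memory.absent
    pool = {s for d in cands for s in DISEASE_SYMPTOMS[d]} - asked
    best, best_score = None, None
    for symptom in sorted(pool):  # sorted for deterministic tie-breaking
        count = sum(symptom in DISEASE_SYMPTOMS[d] for d in cands)
        if count == len(cands):
            continue  # every candidate has it, so asking rules nothing out
        score = abs(2 * count - len(cands))  # 0 means a perfect half split
        if best_score is None or score < best_score:
            best, best_score = symptom, score
    return best


memory = RecordsMemory()
first_question = next_symptom(memory)
memory.update(first_question, True)
print(first_question, memory.candidates())
```

Skipping symptoms that were already asked, or that every remaining candidate shares, is what keeps the questioning from becoming redundant in this sketch; the function returns `None` once no informative question is left.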

Technical Achievements and Evaluation

The LingYi system uses prompt learning for dialogue response generation, with the aim of keeping responses both relevant and accurate. Combining pre-trained models with prompt learning allows the system to generate detailed and contextually pertinent responses. The paper reports F1, BLEU, and Distinct scores that are higher than those of the existing approaches it compares against.
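
For readers unfamiliar with these metrics, the sketch below shows the commonly used definitions of Distinct-n (unique n-grams divided by total n-grams across all generated responses) and a token-overlap F1 between a generated response and a reference. It assumes pre-tokenized input and made-up example sentences; the paper's own evaluation scripts may tokenize and aggregate differently, and BLEU is left out because it is normally computed with an existing library.

```python
from collections import Counter
from typing import List


def distinct_n(responses: List[List[str]], n: int) -> float:
    """Ratio of unique n-grams to total n-grams over all responses."""
    ngrams = [
        tuple(tokens[i:i + n])
        for tokens in responses
        for i in range(len(tokens) - n + 1)
    ]
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


def token_f1(prediction: List[str], reference: List[str]) -> float:
    """Harmonic mean of token-level precision and recall (multiset overlap)."""
    overlap = sum((Counter(prediction) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(prediction)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


# Made-up tokenized responses, for illustration only.
responses = [
    ["take", "rest", "and", "drink", "water"],
    ["drink", "water", "and", "rest"],
]
print(distinct_n(responses, 1))  # 5 unique unigrams out of 9
print(distinct_n(responses, 2))  # 6 unique bigrams out of 7
print(token_f1(["drink", "water"], ["drink", "warm", "water"]))
```

A higher Distinct score indicates less repetitive output, while F1 and BLEU measure overlap with reference responses, so the three are usually read together.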

The evaluation covers entity disambiguation and dialogue generation, with the latter assessed through both automatic metrics and human judgments. The results suggest gains in response fluency and knowledge correctness. On average, human evaluators rated LingYi's outputs as more fluent than those of existing systems, which supports its practical utility.

Implications and Future Directions

LingYi's capacity to deliver automated medical consultations has significant implications for healthcare, particularly in contexts where resources are constrained, such as during the COVID-19 pandemic. The availability of its datasets and models promises to stimulate further research and technological advances in medical AI solutions.

Future work may focus on expanding the multi-modal capabilities of the knowledge graph, enhancing the depth of medical reasoning, and integrating further privacy-preserving techniques such as federated learning to protect patient data while facilitating ongoing improvements in system accuracy and reliability.

Deploying the system is intended both to reduce the burden on healthcare providers and to offer scalable, efficient patient interaction across diverse healthcare settings. By open-sourcing the models and datasets, the paper lays the groundwork for further development and collaboration among researchers working on medical AI.