Abstract

The use of large language models in medical dialogue generation has garnered significant attention, with a focus on improving response quality and fluency. While previous studies have made progress in optimizing model performance for single-round medical Q&A tasks, models still need stronger multi-round conversational capability to avoid logical inconsistencies. To address this, we propose preference learning from process feedback (PLPF), an approach that integrates the doctor's diagnostic logic into LLMs. PLPF comprises three stages: rule modeling, preference data generation, and preference alignment, which together train the model to adhere to the diagnostic process. Experimental results using standardized patient testing show that PLPF improves the diagnostic accuracy of the baseline model in medical conversations by 17.6%, outperforming traditional reinforcement learning from human feedback. PLPF is also effective in both multi-round and single-round dialogue tasks, demonstrating its potential for improving medical dialogue generation.
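The abstract names a preference-alignment stage but does not specify its objective. A common choice for aligning a model to pairwise preference data (here, a process-consistent response preferred over a process-violating one) is a DPO-style loss; the sketch below is a hypothetical illustration of that objective on a single preference pair, not the paper's actual implementation, and all function and parameter names are assumptions.

```python
import math

def dpo_style_loss(policy_chosen_logp, policy_rejected_logp,
                   ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Illustrative DPO-style loss on one preference pair.

    Inputs are summed token log-probabilities of the chosen
    (process-consistent) and rejected (process-violating) responses
    under the trainable policy and a frozen reference model.
    `beta` scales the implicit reward; 0.1 is a typical default.
    """
    # Implicit reward = beta * (policy logp - reference logp).
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # Negative log-sigmoid of the margin: minimized when the policy
    # assigns the chosen response a larger implicit reward.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

With no margin the loss sits at log 2; it falls toward zero as the policy increasingly prefers the process-consistent response relative to the reference model.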
