Integrating Physician Diagnostic Logic into Large Language Models: Preference Learning from Process Feedback (2401.05695v2)

Published 11 Jan 2024 in cs.CL

Abstract: The use of large language models (LLMs) in medical dialogue generation has garnered significant attention, with a focus on improving response quality and fluency. While previous studies have made progress in optimizing model performance for single-round medical Q&A tasks, there is a need to enhance the model's capability for multi-round conversations to avoid logical inconsistencies. To address this, we propose an approach called preference learning from process feedback (PLPF), which integrates the doctor's diagnostic logic into LLMs. PLPF involves rule modeling, preference data generation, and preference alignment to train the model to adhere to the diagnostic process. Experimental results using Standardized Patient Testing show that PLPF enhances the diagnostic accuracy of the baseline model in medical conversations by 17.6%, outperforming traditional reinforcement learning from human feedback. Additionally, PLPF demonstrates effectiveness in both multi-round and single-round dialogue tasks, showcasing its potential for improving medical dialogue generation.
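
The abstract names three stages of PLPF (rule modeling, preference data generation, preference alignment) but does not spell out the alignment objective. As a minimal sketch, one plausible reading is that the alignment stage uses a DPO-style pairwise loss (Direct Preference Optimization is among the works the paper cites) over dialogue turns that follow versus violate the modeled diagnostic rules. The function name, arguments, and beta value below are illustrative assumptions, not the paper's implementation:

```python
import torch
import torch.nn.functional as F

def plpf_alignment_loss(policy_chosen_logps: torch.Tensor,
                        policy_rejected_logps: torch.Tensor,
                        ref_chosen_logps: torch.Tensor,
                        ref_rejected_logps: torch.Tensor,
                        beta: float = 0.1) -> torch.Tensor:
    """DPO-style preference loss: a hypothetical stand-in for PLPF's alignment stage.

    'chosen' = sequence log-probs of dialogue turns that follow the modeled
    diagnostic rules; 'rejected' = log-probs of turns that violate them.
    """
    # Log-ratio of the policy to a frozen reference model for each response
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    # Widen the margin between rule-following and rule-violating turns
    margin = beta * (chosen_ratio - rejected_ratio)
    return -F.logsigmoid(margin).mean()

# Toy usage with made-up sequence log-probabilities for two preference pairs
loss = plpf_alignment_loss(torch.tensor([-12.3, -9.8]),
                           torch.tensor([-14.1, -10.5]),
                           torch.tensor([-12.0, -9.9]),
                           torch.tensor([-13.5, -10.2]))
```

In a PLPF-like pipeline, the preference pairs themselves would come from the earlier stages: the modeled diagnostic rules score sampled dialogue trajectories, labeling one continuation as preferred over another.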
