Towards Causality-Aware Inferring: A Sequential Discriminative Approach for Medical Diagnosis (2003.06534v5)
Abstract: Medical diagnosis assistant (MDA) aims to build an interactive diagnostic agent that sequentially inquires about symptoms to discriminate among diseases. However, since the dialogue records used to build a patient simulator are collected passively, the data may be corrupted by task-unrelated biases, such as the preferences of the collectors. These biases can hinder the diagnostic agent from capturing transportable knowledge from the simulator. This work addresses these critical issues in MDA by taking advantage of the causal diagram to identify and resolve two representative non-causal biases: (i) default-answer bias and (ii) distributional inquiry bias. Bias (i) originates from the patient simulator, which answers unrecorded inquiries with biased default answers; as a consequence, diagnostic agents cannot fully demonstrate their advantages. To eliminate this bias, and inspired by the propensity score matching technique used with the causal diagram, we propose a propensity-based patient simulator that effectively answers unrecorded inquiries by drawing knowledge from other records. Bias (ii) is inherent in passively collected data and is one of the key obstacles to training the agent towards "learning how" rather than "remembering what". For example, if a symptom is highly coupled with a certain disease within the training distribution, the agent may learn to inquire only about that symptom to discriminate that disease, and thus fail to generalize to out-of-distribution cases. To this end, we propose a progressive assurance agent comprising dual processes that account for symptom inquiry and disease diagnosis, respectively. The inquiry process is driven by the diagnosis process in a top-down manner, inquiring about symptoms to enhance diagnostic confidence.
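To make the remedy for Bias (i) concrete, below is a minimal Python sketch of how a propensity-based patient simulator might answer an unrecorded inquiry: estimate a propensity score for each record, then borrow the answer from the nearest-scoring record in which the queried symptom was actually recorded. All names here (`fit_propensity_model`, `answer_inquiry`, the NaN-coded data layout) are hypothetical illustrations under assumed interfaces, not the paper's actual implementation.

```python
# A minimal sketch of propensity-score matching for answering unrecorded
# symptom inquiries (hypothetical interface, not the paper's code).
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_propensity_model(contexts, recorded_mask):
    """Estimate P(symptom recorded | dialogue context) for each record."""
    model = LogisticRegression(max_iter=1000)
    model.fit(contexts, recorded_mask)
    return model

def answer_inquiry(query_context, symptom_idx, contexts, answers, model):
    """Answer an unrecorded inquiry by nearest-neighbour matching on the
    estimated propensity score; `answers` holds NaN where unrecorded."""
    scores = model.predict_proba(contexts)[:, 1]
    query_score = model.predict_proba(query_context[None, :])[0, 1]
    # Only borrow from records where this symptom *was* recorded.
    valid = np.where(~np.isnan(answers[:, symptom_idx]))[0]
    if valid.size == 0:
        return None  # nothing to borrow; caller falls back to "not sure"
    nearest = valid[np.argmin(np.abs(scores[valid] - query_score))]
    return answers[nearest, symptom_idx]
```

Likewise, the progressive assurance agent's top-down control for Bias (ii) can be pictured as a loop in which the diagnosis process gates the inquiry process through its own confidence: while the disease posterior is uncertain, keep inquiring; once confidence clears a threshold, commit to a diagnosis. The callables `diagnoser`, `inquirer`, and `simulator`, and the threshold value, are assumptions for illustration only.

```python
def progressive_diagnosis(diagnoser, inquirer, simulator,
                          max_turns=10, confidence_threshold=0.9):
    """Dual-process sketch (hypothetical interface): the diagnosis
    process drives the inquiry process in a top-down manner, asking for
    the next symptom only while its confidence is below threshold."""
    state = simulator.initial_state()      # self-reported symptoms
    for _ in range(max_turns):
        probs = diagnoser(state)           # P(disease | evidence so far)
        if probs.max() >= confidence_threshold:
            break                          # confident enough to commit
        symptom = inquirer(state, probs)   # pick symptom to raise confidence
        state = simulator.answer(state, symptom)
    return int(probs.argmax())             # final disease prediction
```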