
Understanding Missingness in Time-series Electronic Health Records for Individualized Representation

Published 24 Feb 2024 in cs.LG | (2402.15730v1)

Abstract: With the widespread adoption of machine learning models in healthcare, there is increased interest in building applications for personalized medicine. Despite the plethora of research proposed for personalized medicine, very little of it focuses on representing missingness and learning from missingness patterns in time-series Electronic Health Records (EHR) data. This lack of focus on representing missingness in an individualized way limits the full utilization of machine learning applications for true personalization. In this brief communication, we highlight new insights into patterns of missingness, with real-world examples, and the implications of missingness in EHRs. These insights aim to bridge the gap between theoretical assumptions and practical observations in real-world EHRs. We hope this work will open new directions for better representation in predictive modelling for true personalization.

