
Natural Language Processing Methods to Identify Oncology Patients at High Risk for Acute Care with Clinical Notes (2209.13860v2)

Published 28 Sep 2022 in cs.CL and cs.LG

Abstract: Clinical notes are an essential component of a health record. This paper evaluates how natural language processing (NLP) can be used to identify the risk of acute care use (ACU) in oncology patients once chemotherapy starts. Risk prediction using structured health data (SHD) is now standard, but predictions using free-text formats are complex. This paper explores the use of free-text notes, instead of SHD, for the prediction of ACU. Deep learning models were compared to manually engineered language features. Results show that SHD models minimally outperform NLP models: an L1-penalised logistic regression with SHD achieved a C-statistic of 0.748 (95% CI: 0.735, 0.762), while the same model with language features achieved 0.730 (95% CI: 0.717, 0.745) and a transformer-based model achieved 0.702 (95% CI: 0.688, 0.717). This paper shows how LLMs can be used in clinical applications and underlines how risk bias differs across diverse patient groups, even when using only free-text data.
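
The abstract compares models by C-statistic with a 95% confidence interval. As a rough illustration of what that metric measures, the sketch below computes the C-statistic for a binary outcome as the share of (event, non-event) pairs in which the event case receives the higher risk score, and estimates an interval with a percentile bootstrap. The function names, the toy data, and the choice of a percentile bootstrap are assumptions made for illustration; the abstract does not state how the paper's intervals were obtained.

```python
import numpy as np


def c_statistic(y_true, scores):
    """Concordance probability for a binary outcome.

    Fraction of (positive, negative) pairs where the positive case has the
    higher score; tied scores count as half a concordant pair.
    """
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative case")
    # Pairwise score differences: rows are positives, columns are negatives.
    diff = pos[:, None] - neg[None, :]
    concordant = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(concordant / diff.size)


def bootstrap_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the C-statistic (illustrative choice)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        if y_true[idx].min() == y_true[idx].max():
            continue  # skip resamples that contain only one class
        stats.append(c_statistic(y_true[idx], scores[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = acute care use, 0 = no acute care use.
    y = [0, 0, 1, 1, 0, 1, 0, 1]
    risk = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.6]
    print("C-statistic:", c_statistic(y, risk))
    print("95% CI:", bootstrap_ci(y, risk, n_boot=500))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.748, 0.730, and 0.702 are compared. The pairwise matrix is quadratic in memory, so a rank-based formulation would be preferable for large cohorts.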
